1 Introduction

Multiple images, microlensing (with appreciable magnifications) and arcs in 
clusters are phenomena of strong lensing. In weak gravitational lensing, the 
Jacobi matrix A is very close to the unit matrix, which implies weak dis- 
tortions and small magnifications. Those cannot be identified in individual 
sources, but only in a statistical sense. Because of that, the accuracy of any 
weak lensing study will depend on the number of sources which can be used 
for the weak lensing analysis. This number can be made large either by having
a large number density of sources, or by observing a large solid angle on the sky,
or both. Which of these two aspects is more relevant depends on the specific 
application. Nearly without exception, the sources employed in weak lensing 
studies up to now are distant galaxies observed in the optical or near-IR pass- 
band, since they form the densest population of distant objects in the sky 
(which is a statement both about the source population in the Universe and 
the sensitivity of detectors employed in astronomical observations). To ob- 
serve large number densities of sources, one needs deep observations to probe 
the faint (and thus more numerous) population of galaxies. Faint galaxies, 
however, are small, and therefore their observed shape is strongly affected by 
the Point Spread Function, caused by atmospheric seeing (for ground-based 
observations) and telescope effects. These effects need to be well understood 
and corrected for, which is the largest challenge of observational weak lensing 
studies. On the other hand, observing large regions of the sky quickly leads 
to large data sets, and the problems associated with handling them. We shall 
discuss some of the most important aspects of weak lensing observations in 
Sect. 3. 

The effects just mentioned have prevented the detection of weak lensing 
effects in early studies with photographic plates (e.g., Tyson et al. 1984); they 
are not linear detectors (so correcting for PSF effects is not reliable), nor are
they sensitive enough for obtaining sufficiently deep images. Weak lensing
research came through a number of observational and technical advances. Soon
after the first giant arcs in clusters were discovered (see Sect. 1.2 of
Schneider, this volume; hereafter referred to as IN) by Soucail et al. (1987) and
Lynds & Petrosian (1989), Fort et al. (1988) observed objects in the lensing
cluster Abell 370 which were less extremely stretched than the giant arc, but
still showed a large axis ratio and were aligned in the direction tangent to their
separation vector to the cluster center; they termed these images 'arclets'.
Indeed, with the spectroscopic verification (Mellier et al. 1991) of the arclet 
A5 in A 370 being located at much larger distance from us than the lensing 
cluster, the gravitational lens origin of these arclets was proven. When the 
images of a few background galaxies are deformed so strongly that they can 
be identified as distorted by lensing, there should be many more galaxy im- 
ages where the distortion is much smaller, and where it can only be detected 
by averaging over many such images. Tyson et al. (1990) reported this sta- 
tistical distortion effect in two clusters, thereby initiating the weak lensing 
studies of the mass distribution of clusters of galaxies. This very fruitful field 
of research was put on a rigorous theoretical basis by Kaiser & Squires (1993) 
who showed that from the measurement of the (distorted) shapes of galaxies 
one can obtain a parameter-free map of the projected mass distribution in 
clusters. 

The flourishing of weak lensing in the past ten years was mainly due to 
three different developments. First, the potential of weak lensing was realized, 
and theoretical methods were worked out for using weak lensing measure- 
ments in a large number of applications, many of which will be described in 
later sections. This realization, reaching out of the lensing community, also 
slowly changed the attitude of time allocation committees, and telescope 
time for such studies was granted. Second, returning to the initial remark,
one requires large fields of view for many weak lensing applications, and the
development of increasingly large wide-field cameras installed at the best
astronomical sites has allowed large observational progress to be made. Third,
quantitative methods for the correction of observational effects, like the
blurring of images by the atmosphere and telescope optics, have been developed,
of which the most frequently used one came from Kaiser et al. (1995). We
shall describe this technique, its extensions, tests and alternative methods in 
Sect. 3.5. 

We shall start by describing the basics of weak lensing in Sect. 2, namely 
how the shear, or the projected tidal gravitational field of the intervening 
matter distribution can be determined from measuring the shapes of images 
of distant galaxies. Practical aspects of observations and the measurements 
of image shapes are discussed in Sect. 3. The next two sections are devoted 
to clusters of galaxies; in Sect. 4, some general properties of clusters are 
described, and their strong lensing properties are considered, whereas in Sect. 
5 weak lensing by clusters is treated. As already mentioned, this allows us
to obtain a parameter-free map of the projected (2-D) mass distribution of
clusters. 

We then turn to lensing by the inhomogeneous matter distribution in the
Universe, the large-scale structure. Starting with Gunn (1967),
the observation of the distortion of light bundles by the inhomogeneously
distributed matter in the Universe has been recognized as a unique probe to study the
properties of the cosmological (dark) matter distribution. The theory of this
cosmic shear effect, and its applications, was worked out in the early 1990's 
(e.g., Blandford et al. 1991). In contrast to the lensing situations studied in 
the rest of this book, here the deflecting mass is manifestly three-dimensional; 
we therefore need to generalize the theory of geometrically-thin mass distri- 
butions and consider the propagation of light in an inhomogeneous Universe. 
As will be shown, to leading order this situation can again be described in 
terms of an 'equivalent' surface mass density. The theoretical aspects of this 
large-scale structure lensing, or cosmic shear, are contained in Sect. 6. Al- 
though the theory of cosmic shear was well in place for quite some time, it 
took until the year 2000 before it was observationally discovered, indepen- 
dently and simultaneously by four groups. These early results, as well as the 
much more extensive studies carried out in the past few years, are presented 
and discussed in Sect. 7. In Sect. 8, we consider the weak lensing effects of 
galaxies, which can be used to investigate the mass profile of galaxies. As we 
shall see, this galaxy-galaxy lensing, first detected by Brainerd et al. (1996), 
is directly related to the connection between the galaxy distribution in the 
Universe and the underlying (dark) matter distribution; this lensing effect is 
therefore ideally suited to study the biasing of galaxies; we shall also describe 
alternative lensing effects for investigating the relation between luminous and 
dark matter. In the final Sect. 9 we discuss higher-order cosmic shear statis- 
tics and how lensing by the large-scale structure affects the lens properties 
of localized mass concentrations. Some final remarks are given in Sect. 10. 

Until very recently, weak lensing has been considered by a considerable 
fraction of the community as 'black magic' (or to quote one member of a PhD 
examination committee: "You have a mass distribution about which you don't 
know anything, and then you observe sources which you don't know either, 
and then you claim to learn something about the mass distribution?" ) . Most 
likely the reason for this is that weak lensing is indeed weak. One cannot 'see' 
the effect, nor can it be graphically displayed easily. Only by investigating 
many faint galaxy images can a signal be extracted from the data, and the 
human eye is not sufficient to perform this analysis. This is different even 
from the analysis of CMB anisotropies which, similarly, need to be analyzed
by statistical means, but at least one can display a temperature map of the 
sky. However, in recent years weak lensing has gained a lot of credibility, 
not only because it has contributed substantially to our knowledge about 
the mass distribution in the Universe, but also because different teams, with 
different data sets and different data analysis tools, agree on their results.

Weak lensing has been reviewed before; we shall mention only five
extensive reviews. Mellier (1999) provides a detailed compilation of the weak
lensing results before 1999, whereas Bartelmann & Schneider (2001; hereafter
BS01) present a detailed account of the theory and technical aspects of weak
lensing.¹ More recent summaries of results can also be found in Wittman
(2002) and Refregier (2003a), as well as the cosmic shear review by van 
Waerbeke & Mellier (2003). 

The coverage of topics in this review has been a subject of choice; no 
claim is made about completeness of subjects or references. In particular, 
due to the lack of time during the lectures, the topic of weak lensing of 
the CMB temperature fluctuations has not been covered at all, and is also 
not included in this written version. Apart from this increasingly important 
subject, I hope that most of the currently actively debated aspects of weak 
lensing are mentioned, and the interested reader can find her way to more 
details through the references provided. 

2 The principles of weak gravitational lensing 
2.1 Distortion of faint galaxy images 

Images of distant sources are distorted in shape and size, owing to the tidal 
gravitational field through which light bundles from these sources travel to us. 
Provided the angular size of a lensed image of a source is much smaller than 
the characteristic angular scale on which the tidal field varies, the distortion 
can be described by the linearized lens mapping, i.e., the Jacobi matrix A. 
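The invariance of the surface brightness by gravitational light deflection,
$I(\boldsymbol{\theta}) = I^{(s)}[\boldsymbol{\beta}(\boldsymbol{\theta})]$, together with the locally linearized lens equation,

$$\boldsymbol{\beta} - \boldsymbol{\beta}_0 = \mathcal{A}(\boldsymbol{\theta}_0) \cdot (\boldsymbol{\theta} - \boldsymbol{\theta}_0) \; , \tag{1}$$

where $\boldsymbol{\beta}_0 = \boldsymbol{\beta}(\boldsymbol{\theta}_0)$, then describes the distortion of small lensed images as

$$I(\boldsymbol{\theta}) = I^{(s)}\left[\boldsymbol{\beta}_0 + \mathcal{A}(\boldsymbol{\theta}_0) \cdot (\boldsymbol{\theta} - \boldsymbol{\theta}_0)\right] \; . \tag{2}$$

We recall (see IN) that the Jacobi matrix can be written as

$$\mathcal{A}(\boldsymbol{\theta}) = (1-\kappa) \begin{pmatrix} 1-g_1 & -g_2 \\ -g_2 & 1+g_1 \end{pmatrix} \; , \quad \text{where} \quad g(\boldsymbol{\theta}) = \frac{\gamma(\boldsymbol{\theta})}{1-\kappa(\boldsymbol{\theta})} \tag{3}$$

is the reduced shear, and the $g_\alpha$, $\alpha = 1,2$, are its Cartesian components.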
The reduced shear describes the shape distortion of images through gravita- 
tional light deflection. The (reduced) shear is a 2-component quantity, most 

¹ We follow here the notation of BS01, except that we denote the angular diameter
distance explicitly by $D^{\rm ang}$, whereas $D$ is the comoving angular diameter distance,
which we also write as $f_K$, depending on the context; see Sect. 4.3 of IN for more
details. In most cases, the distance ratio $D_{\rm ds}/D_{\rm s}$ is used, which is the same for
both distance definitions.

conveniently written as a complex number,

$$\gamma = \gamma_1 + {\rm i}\gamma_2 = |\gamma| \, e^{2{\rm i}\varphi} \; ; \quad g = g_1 + {\rm i} g_2 = |g| \, e^{2{\rm i}\varphi} \; ; \tag{4}$$

its amplitude describes the degree of distortion, whereas its phase $\varphi$ yields
the direction of distortion. The reason for the factor '2' in the phase is the
fact that an ellipse transforms into itself after a rotation by 180°. Consider a 
circular source with radius R (see Fig. 1); mapped by the local Jacobi matrix, 
its image is an ellipse, with semi-axes 

$$a = \frac{R}{1-\kappa-|\gamma|} = \frac{R}{(1-\kappa)(1-|g|)} \; ; \quad b = \frac{R}{1-\kappa+|\gamma|} = \frac{R}{(1-\kappa)(1+|g|)} \; ,$$

and the major axis encloses an angle $\varphi$ with the positive $\theta_1$-axis. Hence,
if sources with circular isophotes could be identified, the measured image
ellipticities would immediately yield the value of the reduced shear, through
the axis ratio

$$|g| = \frac{1-b/a}{1+b/a} \quad \Leftrightarrow \quad \frac{b}{a} = \frac{1-|g|}{1+|g|}$$

and the orientation of the major axis $\varphi$. In these relations it was assumed
that $b \le a$, and $|g| < 1$. We shall discuss the case $|g| > 1$ later.
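The relation between convergence, shear and the image of a circular source can be checked numerically. The following is a minimal sketch, assuming the standard expressions for the semi-axes, $a = R/[(1-\kappa)(1-|g|)]$ and $b = R/[(1-\kappa)(1+|g|)]$; the function names and the input values are illustrative choices, not part of the original text.

```python
import cmath

def image_ellipse(radius, kappa, gamma):
    """Image of a circular source of the given radius under (kappa, gamma).

    Returns the semi-major axis a, the semi-minor axis b and the position
    angle phi of the major axis. Assumes |g| < 1 so both axes are positive.
    """
    g = gamma / (1.0 - kappa)                       # reduced shear
    a = radius / ((1.0 - kappa) * (1.0 - abs(g)))   # stretched direction
    b = radius / ((1.0 - kappa) * (1.0 + abs(g)))   # compressed direction
    phi = 0.5 * cmath.phase(g)                      # g = |g| exp(2 i phi)
    return a, b, phi

def reduced_shear_modulus(a, b):
    """Recover |g| from the axis ratio b/a of the image ellipse."""
    ratio = b / a
    return (1.0 - ratio) / (1.0 + ratio)

# Illustrative values only (not taken from the text).
a, b, phi = image_ellipse(radius=1.0, kappa=0.2, gamma=0.1 + 0.0j)
print(a, b, phi, reduced_shear_modulus(a, b))
```

The axis ratio alone returns $|g|$, not $\kappa$ and $|\gamma|$ separately, which is why the reduced shear is the quantity accessible from image shapes.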



Fig. 1. A circular source, shown at the left, is mapped by the inverse Jacobian $\mathcal{A}^{-1}$
onto an ellipse. In the absence of shear, the resulting image is a circle with modified
radius, depending on $\kappa$. Shear causes an axis ratio different from unity, and the
orientation of the resulting ellipse depends on the phase of the shear (source: M.
Bradac)



However, faint galaxies are not intrinsically round, so that the observed 
image ellipticity is a combination of intrinsic ellipticity and shear. The strat- 
egy to nevertheless obtain an estimate of the (reduced) shear consists in 
locally averaging over many galaxy images, assuming that the intrinsic
ellipticities are randomly oriented. In order to follow this strategy, one needs to
clarify first how to define 'ellipticity' for a source with arbitrary isophotes
(faint galaxies are not simply elliptical); in addition, seeing caused by
atmospheric turbulence will blur, and thus circularize, observed images, together
with other effects related to the observation procedure. We will consider these
issues in turn. 



2.2 Measurements of shapes and shear 

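Definition of image ellipticities. Let $I(\boldsymbol{\theta})$ be the brightness distribution
of an image, assumed to be isolated on the sky; the center of the image can
be defined as

$$\bar{\boldsymbol{\theta}} \equiv \frac{\int d^2\theta \; I(\boldsymbol{\theta}) \, q_I[I(\boldsymbol{\theta})] \, \boldsymbol{\theta}}{\int d^2\theta \; I(\boldsymbol{\theta}) \, q_I[I(\boldsymbol{\theta})]} \; , \tag{5}$$

where $q_I(I)$ is a suitably chosen weight function; e.g., if $q_I(I) = {\rm H}(I - I_{\rm th})$,
where ${\rm H}(x)$ is the Heaviside step function, $\bar{\boldsymbol{\theta}}$ would be the center of light
within a limiting isophote of the image. We next define the tensor of second
brightness moments,

$$Q_{ij} = \frac{\int d^2\theta \; I(\boldsymbol{\theta}) \, q_I[I(\boldsymbol{\theta})] \, (\theta_i - \bar\theta_i)(\theta_j - \bar\theta_j)}{\int d^2\theta \; I(\boldsymbol{\theta}) \, q_I[I(\boldsymbol{\theta})]} \; , \quad i,j \in \{1,2\} \; . \tag{6}$$

Note that for an image with circular isophotes, $Q_{11} = Q_{22}$, and $Q_{12} = 0$.
The trace of $Q$ describes the size of the image, whereas the traceless part of
$Q_{ij}$ contains the ellipticity information. From $Q_{ij}$, one defines two complex
ellipticities,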

$$\chi \equiv \frac{Q_{11} - Q_{22} + 2{\rm i}Q_{12}}{Q_{11} + Q_{22}} \quad \text{and} \quad \epsilon \equiv \frac{Q_{11} - Q_{22} + 2{\rm i}Q_{12}}{Q_{11} + Q_{22} + 2(Q_{11}Q_{22} - Q_{12}^2)^{1/2}} \; . \tag{7}$$

Both of them have the same phase (because of the same numerator), but a
different absolute value. Fig. 2 illustrates the shape of images as a function
of their complex ellipticity $\chi$. For an image with elliptical isophotes of axis
ratio $r \le 1$, one obtains

$$|\chi| = \frac{1-r^2}{1+r^2} \; ; \quad |\epsilon| = \frac{1-r}{1+r} \; . \tag{8}$$

Which of these two definitions is more convenient depends on the context;
one can easily transform one into the other,

$$\epsilon = \frac{\chi}{1 + (1-|\chi|^2)^{1/2}} \; ; \quad \chi = \frac{2\epsilon}{1+|\epsilon|^2} \; . \tag{9}$$

In fact, other (but equivalent) ellipticity definitions have been used in the
literature (e.g., Kochanek 1990; Miralda-Escude 1991; Bonnet & Mellier 1995),
but the two given above appear to be most convenient.
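To make the moment-based definitions concrete, the sketch below measures both complex ellipticities on a pixel grid. It assumes a noise-free toy image (an elliptical Gaussian, for which the brightness-weighted second moments along the principal axes are the squared Gaussian widths) and the weight function $q_I(I) = {\rm H}(I - I_{\rm th})$ with $I_{\rm th} = 0$; the function names, grid size and pixel scale are hypothetical choices made for illustration.

```python
import cmath
import math

def ellipticities(image, xs, ys, i_th=0.0):
    """Return (chi, eps) from the second brightness moments of a pixel image.

    image[j][i] is the brightness at position (xs[i], ys[j]). The weight
    function is q_I(I) = H(I - i_th): only pixels brighter than i_th count.
    """
    # Zeroth and first moments give the weighted centre of the image.
    norm = sum_x = sum_y = 0.0
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):
            w = image[j][i] if image[j][i] > i_th else 0.0
            norm += w
            sum_x += w * x
            sum_y += w * y
    cx, cy = sum_x / norm, sum_y / norm

    # Second moments Q_ij about that centre.
    q11 = q22 = q12 = 0.0
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):
            w = image[j][i] if image[j][i] > i_th else 0.0
            q11 += w * (x - cx) ** 2
            q22 += w * (y - cy) ** 2
            q12 += w * (x - cx) * (y - cy)
    q11, q22, q12 = q11 / norm, q22 / norm, q12 / norm

    numerator = complex(q11 - q22, 2.0 * q12)
    chi = numerator / (q11 + q22)
    eps = numerator / (q11 + q22 + 2.0 * math.sqrt(q11 * q22 - q12 ** 2))
    return chi, eps

def gaussian_image(a, b, angle, half_width=12.0, step=0.25):
    """Toy elliptical Gaussian with widths a >= b, major axis at `angle`."""
    n = int(round(2 * half_width / step)) + 1
    xs = [-half_width + k * step for k in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    image = [[math.exp(-0.5 * (((x * c + y * s) / a) ** 2
                               + ((-x * s + y * c) / b) ** 2))
              for x in xs] for y in xs]
    return image, xs, xs

image, xs, ys = gaussian_image(a=2.0, b=1.0, angle=math.radians(30.0))
chi, eps = ellipticities(image, xs, ys)
r = 0.5  # axis ratio b/a of the toy image
print(abs(chi), (1 - r ** 2) / (1 + r ** 2))
print(abs(eps), (1 - r) / (1 + r))
print(math.degrees(0.5 * cmath.phase(chi)))
```

For real data the weight function matters a great deal, since unweighted moments are dominated by sky noise far from the image center; the sketch ignores this on purpose.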
Fig. 2. The shape of image ellipses for a circular source, in dependence
on their two ellipticity components $\chi_1$ and $\chi_2$; a corresponding plot in terms
of the ellipticity components $\epsilon_i$ would look quite similar. Note that the
ellipticities are rotated by 90° when $\chi \to -\chi$ (source: D. Clowe)



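From source to image ellipticities. In total analogy, one defines the
second-moment brightness tensor $Q^{(s)}_{ij}$, and the complex ellipticities $\chi^{(s)}$ and
$\epsilon^{(s)}$ for the unlensed source. From

$$Q^{(s)}_{ij} = \frac{\int d^2\beta \; I^{(s)}(\boldsymbol{\beta}) \, q_I[I^{(s)}(\boldsymbol{\beta})] \, (\beta_i - \bar\beta_i)(\beta_j - \bar\beta_j)}{\int d^2\beta \; I^{(s)}(\boldsymbol{\beta}) \, q_I[I^{(s)}(\boldsymbol{\beta})]} \; , \quad i,j \in \{1,2\} \; , \tag{10}$$

one finds with $d^2\beta = \det\mathcal{A} \; d^2\theta$, $\boldsymbol{\beta} - \bar{\boldsymbol{\beta}} = \mathcal{A}(\boldsymbol{\theta} - \bar{\boldsymbol{\theta}})$, that

$$Q^{(s)} = \mathcal{A} \, Q \, \mathcal{A}^T = \mathcal{A} \, Q \, \mathcal{A} \; , \tag{11}$$

where $\mathcal{A} \equiv \mathcal{A}(\bar{\boldsymbol{\theta}})$. Using the definitions of the complex ellipticities, one finds
the transformations (e.g., Schneider & Seitz 1995; Seitz & Schneider 1997)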



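$$\chi^{(s)} = \frac{\chi - 2g + g^2\chi^*}{1 + |g|^2 - 2{\rm Re}(g\chi^*)} \; ; \quad \epsilon^{(s)} = \begin{cases} \dfrac{\epsilon - g}{1 - g^*\epsilon} & \text{if } |g| \le 1 \; ; \\[2ex] \dfrac{1 - g\epsilon^*}{\epsilon^* - g^*} & \text{if } |g| > 1 \; . \end{cases} \tag{12}$$

The inverse transformations are obtained by interchanging source and image
ellipticities, and $g \to -g$ in the foregoing equations.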
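The transformation of $\epsilon$ and its inverse can be written compactly with complex arithmetic. The sketch below assumes the form $\epsilon^{(s)} = (\epsilon - g)/(1 - g^*\epsilon)$ for $|g| \le 1$ and $\epsilon^{(s)} = (1 - g\epsilon^*)/(\epsilon^* - g^*)$ for $|g| > 1$; the function names and test values are illustrative only.

```python
def source_ellipticity(eps, g):
    """Source ellipticity eps_s for image ellipticity eps and reduced shear g."""
    eps, g = complex(eps), complex(g)
    if abs(g) <= 1.0:
        return (eps - g) / (1.0 - g.conjugate() * eps)
    return (1.0 - g * eps.conjugate()) / (eps.conjugate() - g.conjugate())

def image_ellipticity(eps_s, g):
    """Inverse mapping: swap source and image, and replace g by -g."""
    return source_ellipticity(eps_s, -complex(g))

eps_s = 0.2 - 0.1j                        # illustrative intrinsic ellipticity
for g in (0.3 + 0.1j, 1.5 - 0.5j):        # one case on each side of |g| = 1
    eps = image_ellipticity(eps_s, g)
    print(g, eps, source_ellipticity(eps, g))
```

A circular source ($\epsilon^{(s)} = 0$) is mapped onto $\epsilon = g$ for $|g| \le 1$ and onto $\epsilon = 1/g^*$ otherwise, so the image ellipticity never exceeds unity in modulus.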



Estimating the (reduced) shear. In the following we make the
assumption that the intrinsic orientation of galaxies is random,

$$E\left(\chi^{(s)}\right) = 0 = E\left(\epsilon^{(s)}\right) \; , \tag{13}$$

which is expected to be valid since there should be no direction singled out in
the Universe. This then implies that the expectation value of $\epsilon$ is [as obtained
by averaging the transformation law (12) over the intrinsic source orientation]

$$E(\epsilon) = \begin{cases} g & \text{if } |g| \le 1 \; ; \\ 1/g^* & \text{if } |g| > 1 \; . \end{cases} \tag{14}$$

This is a remarkable result (Schramm & Kaiser 1995; Seitz & Schneider 1997),
since it shows that each image ellipticity provides an unbiased estimate of the
local shear, though a very noisy one. The noise is determined by the intrinsic
ellipticity dispersion

$$\sigma_\epsilon = \sqrt{\left\langle \epsilon^{(s)} \epsilon^{(s)*} \right\rangle} \; ,$$

in the sense that, when averaging over $N$ galaxy images all subject to the
same reduced shear, the 1-$\sigma$ deviation of their mean ellipticity from the true
shear is $\sigma_\epsilon/\sqrt{N}$. A more accurate estimate of this error is

$$\sigma = \sigma_\epsilon \left[1 - \min\left(|g|^2, |g|^{-2}\right)\right] / \sqrt{N} \tag{15}$$

(Schneider et al. 2000). Hence, the noise can be beaten down by averaging
over many galaxy images; however, the region over which the shear can be 
considered roughly constant is limited, so that averaging over galaxy images is 
always related to a smoothing of the shear. Fortunately, we live in a Universe 
where the sky is 'full of faint galaxies', as was impressively demonstrated by 
the Hubble Deep Field images (Williams et al. 1996) and previously from 
ultra-deep ground-based observations (Tyson 1987). Therefore, the accuracy 
of a shear estimate depends on the local number density of galaxies for which 
a shape can be measured. In order to obtain a high density, one requires 
deep imaging observations. As a rough guide, on a 3 hour exposure with a
4-meter class telescope, about 30 galaxies per arcmin$^2$ can be used for a shape
measurement.
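The statement that the mean image ellipticity estimates the reduced shear can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: the intrinsic ellipticity distribution (modulus uniform in $[0, 0.5]$, random phase) is a toy choice and not a model of real galaxies, and the sample size and seed are arbitrary.

```python
import cmath
import math
import random

def image_ellipticity(eps_s, g):
    """Lens a source ellipticity eps_s by the reduced shear g (both complex)."""
    if abs(g) <= 1.0:
        return (eps_s + g) / (1.0 + g.conjugate() * eps_s)
    return (1.0 + g * eps_s.conjugate()) / (eps_s.conjugate() + g.conjugate())

def mean_image_ellipticity(g, n_gal, max_modulus=0.5, seed=1):
    """Average the image ellipticity of n_gal randomly oriented sources.

    The intrinsic distribution (modulus uniform in [0, max_modulus], phase
    uniform in [0, 2 pi)) is a toy assumption, not a model of real galaxies.
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_gal):
        modulus = rng.uniform(0.0, max_modulus)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        total += image_ellipticity(cmath.rect(modulus, phase), g)
    return total / n_gal

n_gal = 20000
for g in (0.1 + 0.05j, 2.0 + 1.0j):
    expected = g if abs(g) <= 1.0 else 1.0 / g.conjugate()
    print(g, mean_image_ellipticity(g, n_gal), expected)
```

For this toy distribution the dispersion is $\sigma_\epsilon \approx 0.29$, so with 20000 sources the scatter of the mean is of order 0.002; the two cases also show that $g$ and $1/g^*$ cannot be told apart from the mean ellipticity alone.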

In fact, considering (14) we conclude that the expectation value of the
observed ellipticity is the same for a reduced shear $g$ and for $g' = 1/g^*$.
Schneider & Seitz (1995) have shown that one cannot distinguish between
these two values of the reduced shear from a purely local measurement, and
termed this fact the 'local degeneracy'; this also explains the symmetry between
$|g|$ and $|g|^{-1}$ in (15). Hence, from a local weak lensing observation one
cannot tell the case $|g| < 1$ (equivalent to $\det\mathcal{A} > 0$) from the one of $|g| > 1$
or $\det\mathcal{A} < 0$. This local degeneracy is, however, broken in large-field
observations, as the region of negative parity of any lens is small (the Einstein
radius inside of which $|g| > 1$ of massive lensing clusters is typically $\lesssim 30''$,
compared to data fields of several arcminutes used for weak lensing studies
of clusters), and the reduced shear must be a smooth function of position on
the sky.

Whereas the transformation between source and image ellipticity appears
simpler in the case of $\chi$ than $\epsilon$ (see (12)), the expectation value of $\chi$ cannot be
easily calculated and depends explicitly on the intrinsic ellipticity distribution
of the sources. In particular, the expectation value of $\chi$ is not simply related
to the reduced shear (Schneider & Seitz 1995). However, in the weak lensing
regime, $\kappa \ll 1$, $|\gamma| \ll 1$, one finds

$$\gamma \approx g \approx \langle\epsilon\rangle \approx \frac{\langle\chi\rangle}{2} \; . \tag{16}$$



2.3 Tangential and cross component of shear 

Components of the shear. The shear components $\gamma_1$ and $\gamma_2$ are defined
relative to a reference Cartesian coordinate frame. Note that the shear is not
a vector (though it is often wrongly called that way in the literature), owing
to its transformation properties under rotations: Whereas the components
of a vector are multiplied by $\cos\varphi$ and $\sin\varphi$ when the coordinate frame is
rotated by an angle $\varphi$, the shear components are multiplied by $\cos(2\varphi)$ and
$\sin(2\varphi)$, or simply, the complex shear gets multiplied by $e^{-2{\rm i}\varphi}$. The reason for
this transformation behavior of the shear traces back to its original definition
as the traceless part of the Jacobi matrix $\mathcal{A}$. This transformation behavior is
the same as that of the linear polarization; the shear is therefore a polar. In
analogy with vectors, it is often useful to consider the shear components in
a rotated reference frame, that is, to measure them w.r.t. a different
direction; for example, the arcs in clusters are tangentially aligned, and so their
ellipticity is oriented tangent to the radius vector in the cluster.
Fig. 3. Illustration of the tangential and cross-components of the
shear, for an image with $\epsilon_1 = 0.3$, $\epsilon_2 = 0$, and three different
directions $\phi$ with respect to a reference point (source: M. Bradac)



If $\phi$ specifies a direction, one defines the tangential and cross components
of the shear relative to this direction as

$$\gamma_{\rm t} = -{\rm Re}\left[\gamma \, e^{-2{\rm i}\phi}\right] \; , \quad \gamma_\times = -{\rm Im}\left[\gamma \, e^{-2{\rm i}\phi}\right] \; . \tag{17}$$

For example, in case of a circularly-symmetric matter distribution, the shear
at any point will be oriented tangent to the direction towards the center
of symmetry. Thus in this case choose $\phi$ to be the polar angle of a point;
then, $\gamma_\times = 0$. In full analogy to the shear, one defines the tangential and
cross components of an image ellipticity, $\epsilon_{\rm t}$ and $\epsilon_\times$. An illustration of these
definitions is provided in Fig. 3.

The sign in (17) is easily explained (and memorized) as follows: consider
a circular mass distribution and a point on the $\theta_1$-axis outside the Einstein
radius. The image of a circular source there will be stretched in the direction
of the $\theta_2$-axis. In this case, $\phi = 0$ in (17), the shear is real and negative, and
in order to have the tangential shear positive, and thus to define tangential
shear in accordance with the intuitive understanding of the word, a minus
sign is introduced. Negative tangential ellipticity implies that the image is
oriented in the radial direction. We warn the reader that sign conventions and
notations have undergone several changes in the literature, and the current
author had his share in this.
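The decomposition into tangential and cross components amounts to one complex multiplication. The sketch below applies the definition $\gamma_{\rm t} = -{\rm Re}[\gamma e^{-2{\rm i}\phi}]$, $\gamma_\times = -{\rm Im}[\gamma e^{-2{\rm i}\phi}]$ to an ellipticity $\epsilon = 0.3$ seen from three directions; the function name and the choice of angles are illustrative assumptions.

```python
import cmath
import math

def tangential_cross(shear, phi):
    """Tangential and cross components of a complex shear (or ellipticity).

    phi is the polar angle of the direction from the reference point to the
    position where the shear is measured.
    """
    rotated = shear * cmath.exp(-2j * phi)
    return -rotated.real, -rotated.imag

eps = 0.3 + 0.0j  # image elongated along the theta_1-axis
for degrees in (0.0, 45.0, 90.0):
    eps_t, eps_x = tangential_cross(eps, math.radians(degrees))
    print(degrees, round(eps_t, 3), round(eps_x, 3))
```

Seen from a reference point on the $\theta_1$-axis ($\phi = 0$) this image is radially oriented and has negative tangential ellipticity, whereas from $\phi = 90°$ the same image is tangentially oriented.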



Minimum lens strength for its weak lensing detection. As a first
application of this decomposition, we consider how massive a lens needs to
be in order that it produces a detectable weak lensing signal. For this purpose,
consider a lens modeled as an SIS with one-dimensional velocity dispersion
$\sigma_v$. In the annulus $\theta_{\rm in} \le \theta \le \theta_{\rm out}$, centered on the lens, let there be $N$ galaxy
images with positions $\boldsymbol{\theta}_i = \theta_i (\cos\phi_i, \sin\phi_i)$ and (complex) ellipticities $\epsilon_i$. For
each one of them, consider the tangential ellipticity

$$\epsilon_{{\rm t}i} = -{\rm Re}\left(\epsilon_i \, e^{-2{\rm i}\phi_i}\right) \; . \tag{18}$$

The weak lensing signal-to-noise for the detection of the lens obtained by
considering a weighted average over the tangential ellipticity is (see BS01,
Sect. 4.5)

$$\begin{aligned} \frac{S}{N} &= \frac{\theta_{\rm E}}{\sigma_\epsilon} \sqrt{\pi n} \, \sqrt{\ln(\theta_{\rm out}/\theta_{\rm in})} \\ &= 8.4 \left(\frac{n}{30\,{\rm arcmin}^{-2}}\right)^{1/2} \left(\frac{\sigma_\epsilon}{0.3}\right)^{-1} \left(\frac{\sigma_v}{600\,{\rm km\,s^{-1}}}\right)^{2} \left(\frac{\ln(\theta_{\rm out}/\theta_{\rm in})}{\ln 10}\right)^{1/2} \left\langle \frac{D_{\rm ds}}{D_{\rm s}} \right\rangle \; , \end{aligned} \tag{19}$$

where $\theta_{\rm E} = 4\pi(\sigma_v/c)^2 (D_{\rm ds}/D_{\rm s})$ is the Einstein radius of an SIS, $n$ the mean
number density of galaxies, and the average of the distance ratio is taken
over the source population from which the shear measurements are obtained.
Hence, the S/N is proportional to the lens strength (as measured by $\theta_{\rm E}$),
the square root of the number density, and inversely proportional to $\sigma_\epsilon$, as
expected. From this consideration we conclude that clusters of galaxies with
$\sigma_v \gtrsim 600$ km/s can be detected with sufficiently large S/N by weak lensing,
but individual galaxies ($\sigma_v \sim 200$ km/s) are too weak as lenses to be detected
individually. Furthermore, the final factor in (19) implies that, for a given
source population, the cluster detection will be more difficult for increasing
lens redshift.
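The scaling of the detection significance with lens strength can be evaluated directly. This is a rough sketch, assuming the expression $S/N = (\theta_{\rm E}/\sigma_\epsilon)\sqrt{\pi n}\,\sqrt{\ln(\theta_{\rm out}/\theta_{\rm in})}$ with $\theta_{\rm E} = 4\pi(\sigma_v/c)^2 \langle D_{\rm ds}/D_{\rm s}\rangle$; the default arguments are the fiducial values quoted in the text, except for a distance ratio of unity, which is an idealization giving an upper limit. The function names are hypothetical.

```python
import math

C_KM_S = 299792.458                         # speed of light in km/s
ARCMIN_PER_RADIAN = 180.0 * 60.0 / math.pi

def einstein_radius_sis(sigma_v, dist_ratio=1.0):
    """Einstein radius of an SIS in arcmin; sigma_v in km/s."""
    return (4.0 * math.pi * (sigma_v / C_KM_S) ** 2
            * dist_ratio * ARCMIN_PER_RADIAN)

def sis_signal_to_noise(sigma_v, n_gal=30.0, sigma_eps=0.3,
                        radius_ratio=10.0, dist_ratio=1.0):
    """S/N of an SIS detection from tangential ellipticities in an annulus.

    n_gal        -- galaxy number density per arcmin^2
    sigma_eps    -- intrinsic ellipticity dispersion
    radius_ratio -- theta_out / theta_in of the annulus
    dist_ratio   -- mean D_ds / D_s of the source population
    """
    theta_e = einstein_radius_sis(sigma_v, dist_ratio)
    return (theta_e / sigma_eps * math.sqrt(math.pi * n_gal)
            * math.sqrt(math.log(radius_ratio)))

for sigma_v in (600.0, 200.0):
    print(sigma_v, round(sis_signal_to_noise(sigma_v), 2))
```

Because the signal scales as $\sigma_v^2$, a galaxy-scale halo with a velocity dispersion three times lower gives a signal-to-noise nine times smaller, which is below unity for these parameters.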



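Mean tangential shear on circles. In the case of axi-symmetric mass
distributions, the tangential shear is related to the surface mass density $\kappa(\theta)$
and the mean surface mass density $\bar\kappa(\theta)$ inside the radius $\theta$ by $\gamma_{\rm t} = \bar\kappa - \kappa$, as
can be easily shown by the relation in Sect. 3.1 of IN. It is remarkable that
a very similar expression holds for general matter distributions. To see this,
we start from Gauss' theorem, which states that

$$\int_0^\theta d^2\vartheta \; \nabla \cdot \nabla\psi = \theta \oint d\varphi \; \nabla\psi \cdot \boldsymbol{n} \; ,$$

where the integral on the left-hand side extends over the area of a circle of
radius $\theta$ (with its center chosen as the origin of the coordinate system), $\psi$ is
an arbitrary scalar function, the integral on the right extends over the circle
with radius $\theta$, and $\boldsymbol{n}$ is the outward directed normal on this circle. Taking $\psi$
to be the deflection potential and noting that $\nabla^2\psi = 2\kappa$, one obtains

$$m(\theta) \equiv \frac{1}{\pi} \int_0^\theta d^2\vartheta \; \kappa(\boldsymbol{\vartheta}) = \frac{\theta}{2\pi} \oint d\varphi \; \frac{\partial\psi}{\partial\theta} \; , \tag{20}$$

where we used that $\nabla\psi \cdot \boldsymbol{n} = \psi_{,\theta}$. Differentiating this equation with respect
to $\theta$ yields

$$\frac{dm}{d\theta} = \frac{m}{\theta} + \frac{\theta}{2\pi} \oint d\varphi \; \frac{\partial^2\psi}{\partial\theta^2} \; . \tag{21}$$

Consider a point on the $\theta_1$-axis; there, $\psi_{,\theta\theta} = \psi_{,11} = \kappa + \gamma_1 = \kappa - \gamma_{\rm t}$. This
last expression is independent of the choice of coordinates and must therefore
hold for all $\varphi$. Denoting by $\langle\kappa(\theta)\rangle$ and $\langle\gamma_{\rm t}(\theta)\rangle$ the mean surface mass density
and mean tangential shear on the circle of radius $\theta$, (21) becomes

$$\frac{dm}{d\theta} = \frac{m}{\theta} + \theta \left[\langle\kappa(\theta)\rangle - \langle\gamma_{\rm t}(\theta)\rangle\right] \; . \tag{22}$$

The dimensionless mass $m(\theta)$ in the circle is related to the mean surface mass
density inside the circle $\bar\kappa(\theta)$ by

$$m(\theta) = \theta^2 \, \bar\kappa(\theta) = 2 \int_0^\theta d\vartheta \; \vartheta \, \langle\kappa(\vartheta)\rangle \; . \tag{23}$$

Together with $dm/d\theta = 2\theta \langle\kappa(\theta)\rangle$, (22) becomes, after dividing through $\theta$,

$$\langle\gamma_{\rm t}\rangle = \bar\kappa - \langle\kappa\rangle \; , \tag{24}$$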

a relation which very closely matches the result mentioned above for axi- 
symmetric mass distributions (Bartelmann 1995). One important immediate 
implication of this result is that from a measurement of the tangential shear, 
averaged over concentric circles, one can determine the azimuthally-averaged 
mass profile of lenses, even if the density is not axi-symmetric. 
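That the relation $\langle\gamma_{\rm t}\rangle = \bar\kappa - \langle\kappa\rangle$ does not require symmetry about the chosen center can be tested numerically. The sketch below uses a point mass displaced from the origin as a deliberately non-axisymmetric example; this lens model is an assumption made for illustration and does not appear in the text. Its potential is $\psi = \theta_{\rm E}^2 \ln|\boldsymbol{\theta} - \boldsymbol{\theta}_0|$, which gives the complex shear $\gamma = -\theta_{\rm E}^2/(z^* - z_0^*)^2$ with $z = \theta_1 + {\rm i}\theta_2$, vanishing $\kappa$ on any circle that misses the mass, and $\bar\kappa = \theta_{\rm E}^2/\theta^2$ for circles enclosing it.

```python
import cmath
import math

def point_mass_shear(z, z0, theta_e):
    """Complex shear at z = theta_1 + i*theta_2 of a point mass located at z0."""
    return -theta_e ** 2 / (z - z0).conjugate() ** 2

def mean_tangential_shear(shear_fn, radius, n_samples=720):
    """Average of gamma_t over a circle of given radius centred on the origin."""
    total = 0.0
    for k in range(n_samples):
        phi = 2.0 * math.pi * k / n_samples
        z = radius * cmath.exp(1j * phi)
        total += -(shear_fn(z) * cmath.exp(-2j * phi)).real
    return total / n_samples

theta_e = 1.0
z0 = 0.3 + 0.2j  # lens offset from the centre of the circles (illustrative)

def shear(z):
    return point_mass_shear(z, z0, theta_e)

for radius in (2.0, 0.2):
    enclosed = abs(z0) < radius
    kappa_bar = theta_e ** 2 / radius ** 2 if enclosed else 0.0
    print(radius, mean_tangential_shear(shear, radius), kappa_bar)
```

The circle of radius 2 encloses the mass and returns $\bar\kappa$ even though the shear pattern is not centered on the origin, while the circle of radius 0.2 encloses no mass and the tangential shear averages to zero.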

2.4 Magnification effects

Recall from IN that a magnification $\mu$ changes source counts according to

$$n(>S, z) = \frac{1}{\mu(\boldsymbol{\theta}, z)} \; n_0\!\left(> \frac{S}{\mu(\boldsymbol{\theta}, z)}, z\right) \; , \tag{25}$$

where $n(>S, z)$ and $n_0(>S, z)$ are the lensed and unlensed cumulative number
densities of sources, respectively. The first argument of $n_0$ accounts for
the change of the flux (which implies that a magnification $\mu > 1$ allows the
detection of intrinsically fainter sources), whereas the prefactor in (25) stems
from the change of apparent solid angle. In the case that $n_0(S) \propto S^{-\alpha}$, this
yields

$$\frac{n(>S)}{n_0(>S)} = \mu^{\alpha-1} \; , \tag{26}$$

and therefore, if $\alpha > 1$ ($< 1$), source counts are enhanced (depleted); the
steeper the counts, the stronger the effect. In the case of weak lensing, where
$|\mu - 1| \ll 1$, one probes the source counts only over a small range in flux,
so that they can always be approximated (locally) by a power law. Provided
that $\kappa \ll 1$, $|\gamma| \ll 1$, a further approximation applies,

$$\mu \approx 1 + 2\kappa \; ; \quad \text{and} \quad \frac{n(>S)}{n_0(>S)} \approx 1 + 2(\alpha-1)\kappa \; . \tag{27}$$

Thus, from a measurement of the local number density $n(>S)$ of galaxies, $\kappa$
can in principle be inferred directly. It should be noted that $\alpha \sim 1$ for
galaxies in the B-band, but in redder bands, $\alpha < 1$ (e.g., Ellis 1997); therefore,
one expects a depletion of their counts in regions of magnification $\mu > 1$.
Broadhurst et al. (1995) have discussed in detail the effects of magnification
in weak lensing. Not only are the number counts affected, but since this is
a redshift-dependent effect (since both $\kappa$ and $\gamma$ depend, for a given
physical surface mass density, on the source redshift), the redshift distribution of
galaxies is locally changed by magnification.
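The size of the number-count effect is easy to quantify. The sketch below compares the power-law result $n/n_0 = \mu^{\alpha-1}$ with its weak lensing approximation $1 + 2(\alpha-1)\kappa$, using the standard magnification $\mu = 1/[(1-\kappa)^2 - |\gamma|^2]$; the function names and the values of $\kappa$, $\gamma$ and $\alpha$ are illustrative assumptions.

```python
def magnification(kappa, gamma):
    """Magnification mu = 1 / det A for convergence kappa and complex shear gamma."""
    return 1.0 / ((1.0 - kappa) ** 2 - abs(gamma) ** 2)

def count_ratio(mu, alpha):
    """Lensed over unlensed cumulative counts for n_0(>S) proportional to S^-alpha."""
    return mu ** (alpha - 1.0)

def count_ratio_weak(kappa, alpha):
    """First-order approximation, valid for kappa << 1 and |gamma| << 1."""
    return 1.0 + 2.0 * (alpha - 1.0) * kappa

kappa, gamma = 0.02, 0.01 + 0.0j          # illustrative weak lensing values
mu = magnification(kappa, gamma)
for alpha in (0.5, 1.0, 2.0):             # shallow, neutral and steep counts
    print(alpha, count_ratio(mu, alpha), count_ratio_weak(kappa, alpha))
```

For $\alpha = 1$ the flux gain and the dilution in solid angle cancel exactly, so counts with a slope close to unity carry almost no magnification information.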

Since magnification is merely a stretching of solid angle, Bartelmann & 
Narayan (1995) pointed out that magnified images at fixed surface bright- 
ness have a larger solid angle than unlensed ones; in addition, the sur- 
face brightness of a galaxy is expected to be a strong function of redshift 
[$I \propto (1+z)^{-4}$], owing to the Tolman effect. Hence, if this effect could be
harnessed, a (redshift-dependent) magnification could be measured
statistically. Unfortunately, this method is hampered by observational difficulties;
it seems that obtaining a reliable estimate of the surface brightness from
seeing-convolved images (see Sect. 3.5) is even more difficult than determining 
image shapes. 
Fig. 4. The size of galaxies observed with the ACS camera on-board HST.
Small dots denote the half-light radius of individual galaxies, bigger points
with error bars show the mean size in a magnitude bin. The horizontal line of
points at $r_{\rm h} \sim 0.08''$ corresponds to stellar images in the ACS fields, as they
all have the same size but vary in magnitude, and points at even smaller
size are noise artefacts which are not used for any lensing analysis
(source: T. Schrabback)



3 Observational issues and challenges 

Weak lensing, employing the shear method, relies on the shape measurements
of faint galaxy images. Since the noise due to intrinsic ellipticity dispersion is
$\propto \sigma_\epsilon/\sqrt{n}$, one needs a high number density $n$ to beat this noise component
down. However, the only way to increase the number density of galaxies is
to observe to fainter magnitudes. As it turns out, galaxies at faint
magnitudes are small, in fact typically smaller than the size of the point-spread
function (PSF), or the seeing disk (see Fig. 4). Hence, for them one usually
needs large correction factors between the true ellipticity and that of the
seeing-convolved image. On the other hand, fainter galaxies tend to lie at
higher redshift, which increases the lensing signal due to the $D_{\rm ds}/D_{\rm s}$-dependence
of the 'lensing efficiency'.

3.1 Strategy 

In the present observational situation, only the optical sky is densely
populated with sources; therefore, weak lensing observations are performed with
optical (or near-IR) CCD cameras (photographic plates are not linear enough
to measure these subtle effects). In order to substantiate this comment, note
that the Hubble Deep Field North contains about 3000 galaxies, but only
seven radio sources are detected in a very deep integration with the VLA
(Richards et al. 1998).² In order to obtain a high number density of sources,

² The source density on the radio sky will become at least comparable to that
currently on the optical sky with the future Square Kilometer Array (SKA).

long exposures are needed: as an illustrative example, to get a number
density of useful galaxies (i.e., those for which a shape can be measured reliably)
of $n \sim 20\,{\rm arcmin}^{-2}$, one needs $\sim 2$ hours integration on a 4-m class telescope
in good seeing $\sigma \lesssim 1''$.

Furthermore, large solid angles are desired, either to get large areas around
clusters for their mass reconstruction, or to get good statistics of lenses on
blank field surveys, as are needed for galaxy-galaxy lensing and cosmic
shear studies. It is now possible to cover large areas in reasonable amounts
of observing time, since large format CCD cameras have recently become
available; for example, the Wide-Field Imager (WFI) at the ESO/MPG 2.2-m
telescope at La Silla has $(8{\rm K})^2$ pixels and covers an area of $\sim (0.5\,{\rm deg})^2$.
Until recently, the CFH12K camera with 8K x 12K pixels and field $\sim 30' \times 45'$
was mounted at the Canada-France-Hawaii Telescope (CFHT) on Mauna Kea
and was arguably the most efficient wide-field imaging instrument hitherto.
In 2003, MegaCam was put into operation on the CFHT; it has
$(18{\rm K})^2$ pixels and covers $\sim 1\,{\rm deg}^2$. Several additional cameras of comparable
size will become operational in the near future, including the $1\,{\rm deg}^2$
instrument OmegaCAM on the newly built VLT Survey Telescope on Paranal. The
largest field camera on a 10-m class telescope is SuprimeCAM, a $34' \times 27'$
multi-chip camera on the Subaru 8.2-meter telescope. Unfortunately, many
optical astronomers (and decision making panels of large facilities) consider
the prime use of large telescopes to be spectroscopy; for example, although
the four ESO VLT unit telescopes are equipped with a total of ten
instruments, the largest imagers on the VLT are the two FORS instruments, with
a $\sim 6.7'$ field-of-view.³

The typical pixel size of these cameras is ~ 0".2, which is needed to sample
the seeing disk in times of good seeing. From Fig. 4 one concludes immediately
that the seeing conditions are absolutely critical for weak lensing: an image
with 0".6 seeing is substantially more useful than one taken under the more
typical condition of 0".8 (see Fig. 5). There are two separate reasons why the
seeing is such an important factor. First, seeing blurs the images and makes
them rounder; accordingly, to correct for the seeing effect, a larger
correction factor is needed in worse seeing conditions. In addition, since the
galaxy images from which the shear is to be determined are faint, a larger
seeing smears the light from these galaxies over a larger area on the sky,
reducing its contrast relative to the sky noise, and therefore leads to noisier
estimates of the ellipticities even before the correction.

³ Nominally, the VIMOS instrument has a four times larger f.o.v., but our analysis
of early VIMOS imaging data indicates that it is totally useless for weak lensing 
observations, owing to its highly anisotropic PSF, which even seems to show 
discontinuities on chips, and its large variation of the seeing size across chips. It 
may be hoped that some of these image defects are improved after a complete 
overhaul of the instrument which occurred recently. 
Fig. 5. Mean number density of galaxy images for which a shape can be measured
(upper row) and the r.m.s. noise of a shear measurement in an area of
1 arcmin$^2$ as a function of the full width at half maximum (FWHM) of the
point-spread function (PSF), i.e., the seeing. The data were taken on 20
different fields with the FORS2 instrument at the VLT, with different filters
(I, R, V and B). Squares show data taken with about 2 hours integration time,
circles those with ~ 45 min exposure. The right-most panels show the coadded
data of I, R, V for the long exposures, and I, V, B for the 45 min fields. The
useful number of galaxy images is seen to be a strong function of the seeing,
except for the I-band (which is related to the higher sky brightness and the
way objects are detected). But even more dramatically, the noise due to
intrinsic source ellipticity decreases strongly for better seeing conditions,
which is due to (1) the higher number density of galaxies for which a shape can
be measured, and (2) smaller corrections for PSF blurring, reducing the
associated noise of this correction. In fact, this figure shows that seeing is
a more important quantity than the total exposure time (from Clowe et al. 2004b)



Deep observations of a field require multiple exposures. As a characteristic 
number, the exposure time for an R-band image on a 4-m class telescope is 
not longer than ~ 10 min to avoid the non-linear part of the CCD sensitivity 
curve (exposures in shorter wavelength bands can be longer, since the night 
sky is fainter in these filters). Therefore, these large-format cameras imply a
high data rate; e.g., one night of observing with the WFI yields ~ 30 GB of
science and calibration data. This number will increase by a factor ~ 6 for
MegaCam. Correspondingly, handling these data requires large disk space for
efficient data reduction.
3.2 Data reduction: Individual frames

We shall now consider a number of issues concerning the reduction of imaging 
data, starting here with the steps needed to treat individual chips on indi- 
vidual frames, and later consider aspects of combining them into a coadded 
image. 

Flatfielding. The pixels of a CCD have different sensitivity, i.e., they yield 
different counts for a given amount of light falling onto them. In order to 
calibrate the pixel sensitivity, one needs flatfielding. Three standard methods 
for this are in use: 

1. Dome-flats: a uniformly illuminated screen in the telescope dome is ex- 
posed; the counts in the pixels are then proportional to their sensitivity. 
The problem here is that the screen is not really of uniform brightness. 

2. Twilight-flats: in the period of twilight after sunset, or before sunrise, the 
cloudless sky is nearly uniformly bright. Short exposures of regions of the 
sky without bright stars are then used to calibrate the pixel sensitivity. 

3. Superflats: if many exposures with different pointings are taken with a 
camera during a night, then any given pixel is not covered by a source for 
most of the exposures (because the fraction of the sky at high galactic 
latitudes which is covered by objects is fairly small, as demonstrated by 
the deep fields taken by the HST). Hence, the (exposure-time normalized) 
counts of any pixel will show, in addition to a little tail due to those ex- 
posures when a source has covered it, a distribution around its sensitivity 
to the uniform night-sky brightness; from that distribution, the flat-field 
can be constructed, by taking its mode or its median. 
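
The superflat construction in item 3 can be sketched in a few lines. This is a
minimal illustration using only the Python standard library, not the pipeline
used for the data shown here; the function name `superflat`, the nested-list
image layout and the toy numbers are assumptions made for the example.

```python
from statistics import median


def superflat(exposures, exptimes):
    """Build a flat field from dithered science exposures.

    exposures: list of 2-D lists of counts, all with the same shape.
    exptimes:  exposure time of each frame (same length as exposures).
    Returns a 2-D list of relative pixel sensitivities with mean 1.
    """
    ny, nx = len(exposures[0]), len(exposures[0][0])
    # Per-pixel median of the exposure-time-normalized counts: robust against
    # the minority of exposures in which a source covers the pixel.
    flat = [[median(frame[y][x] / t for frame, t in zip(exposures, exptimes))
             for x in range(nx)] for y in range(ny)]
    mean = sum(sum(row) for row in flat) / (nx * ny)
    return [[value / mean for value in row] for row in flat]


# Toy example (hypothetical numbers): a 2x2 detector under a uniform sky of
# 100 counts per second, with one pixel covered by a source in one exposure.
sensitivity = [[1.0, 0.9], [1.1, 1.0]]
exptimes = [60.0, 90.0, 60.0, 120.0, 60.0]
exposures = [[[s * 100.0 * t for s in row] for row in sensitivity]
             for t in exptimes]
exposures[2][0][0] += 500.0
flat = superflat(exposures, exptimes)
```

The median is used here because it ignores the tail caused by the single
contaminated exposure; the mode mentioned in the text would serve the same
purpose.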



Bad pixels. Each CCD has defects, in that some pixels are dead or show 
a signal unrelated to their illumination. This can occur as individual pixels, 
or whole pixel columns. No information of the sky image is available at these 
pixel positions. One therefore employs dithering: several exposures of the 
same field, but with slightly different pointings (dither positions) are taken. 
Then, any position of the field falls on bad pixels only in a small fraction 
of exposures, so that the full two-dimensional brightness distribution can be 
recovered. 

Cosmic rays. Those mimic groups of bad pixels. Since a given point of the
image will most likely be hit by a cosmic ray only once, cosmic rays can be
removed (or more precisely, masked) by comparing the different exposures.
Another signature of a cosmic ray is that the width of its track is typically
much smaller than the seeing disk, the minimum size of any real source.
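
The comparison between exposures can be sketched as an outlier test per sky
position. The following is a simplified illustration, assuming the frames have
already been registered to a common grid and scaled to the same exposure time;
the function name, the threshold `nsigma` and the `noise_floor` parameter are
choices made for this example only.

```python
from statistics import median


def cosmic_ray_mask(stack, nsigma=5.0, noise_floor=1.0):
    """Flag pixels that are much brighter than the same position in the
    other exposures.

    stack: list of registered frames, stack[k][y][x].
    Returns mask[k][y][x], True where a cosmic-ray hit is suspected.
    """
    n, ny, nx = len(stack), len(stack[0]), len(stack[0][0])
    mask = [[[False] * nx for _ in range(ny)] for _ in range(n)]
    for y in range(ny):
        for x in range(nx):
            values = [frame[y][x] for frame in stack]
            med = median(values)
            # Robust scatter estimate from the median absolute deviation.
            mad = median(abs(v - med) for v in values)
            sigma = max(1.4826 * mad, noise_floor)
            for k, v in enumerate(values):
                if v - med > nsigma * sigma:   # only positive outliers
                    mask[k][y][x] = True
    return mask


# Five exposures of a 1x3 strip; frame 2 has a hit on the middle pixel.
stack = [[[10.0, 10.0, 12.0]], [[11.0, 11.0, 11.0]], [[9.0, 500.0, 12.0]],
         [[10.0, 9.0, 13.0]], [[10.0, 10.0, 12.0]]]
mask = cosmic_ray_mask(stack)
n_flagged = sum(flag for frame in mask for row in frame for flag in row)
```

A production pipeline would in addition use the second signature named above,
the sharpness of the track relative to the seeing disk, which this sketch does
not attempt.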
Fig. 6. A flat field for the CFH12K camera, showing the sensitivity variations
between pixels and in particular between chips. Also, bad columns are clearly seen 



Bright stars. Those cause large diffraction spikes, and depending on the 
optics and the design of the camera, reflection rings, ghost images and other 
unwanted features. It is therefore best to choose fields where no or very few 
bright stars are present. The diffraction spikes of stars need to be masked, as 
well as the other features just mentioned. 

Fringes. Owing to light reflection within the CCD, patterns of illumination 
across the field can be generated (see Fig. 8); this is particularly true for thin 
chips when rather long-wavelength filters are used. On clear nights, the fringe
pattern is stable, i.e., essentially the same for all images taken during the
night; in that case, it can be deduced from the images and subtracted from
the individual exposures. However, if the nights are not clear, this procedure
no longer works well; it is then safer to observe at shorter wavelengths. For
example, for the WFI, fringing is a problem for I-band images, but for the 
R-band filter, the amplitude of fringing is small. For the FORS instruments 
at the VLT, essentially no fringing occurs even in the I band (Maoli et al. 
2001). 

Gaps. The individual CCDs in multi-chip cameras cannot be brought together
arbitrarily close; hence, there are gaps between the CCDs (see Fig. 9 for an
example). The dither pattern can be chosen such that the gaps fall on different
parts of the sky in different exposures, so that every sky position is covered
by at least some of them. As we shall see, such relatively large dither
patterns also provide additional advantages.

Fig. 7. A raw frame from the CFH12K camera, showing quite a number of effects
mentioned in the text: bad columns, saturation of bright stars, bleeding, and
sensitivity variations across the field and in particular between chips

Satellite trails, asteroid trails. Those have to be identified, either by
visual inspection (currently the default) or by image recognition software
which can detect these linear features, which occur either only once or at
different positions on different exposures. These are then masked, in the same
way as some of the other features mentioned above. 

3.3 Data reduction: coaddition 

After taking several exposures with slightly different pointing positions (for 
the reasons given above), frames shall be coadded to a sum-frame; some of 
the major steps in this coaddition procedure are: 

Fig. 8. The two left panels show the fringe patterns of images taken with the
WFI in the I-band; the upper one was taken during photometric conditions, the
lower one under non-photometric conditions. Since the fringe pattern is
spatially stable, it can be corrected for (right panels), but the result is
satisfactory only in the former case (source: M. Schirmer & T. Erben)

Fig. 9. Layout of the Wide Field Imager (WFI) at the ESO/MPG 2.2m telescope at
La Silla. The eight chips each have ~ 2048 × 4096 pixels and cover ~ 7'.5 × 15'

Astrometric solution. One needs to coadd data from the same true (or sky)
position, not the same pixel position. Therefore, one needs a very precise
mapping from sky coordinates to pixel coordinates. Field distortions, which
occur in every camera (and especially so in wide-field cameras), make this
mapping non-linear (see Fig. 10). Whereas the distortion map of the
telescope/camera system is to a large degree constant and therefore one of the
known features, it is not stable to the sub-pixel accuracy needed for weak
lensing work, owing to its dependence on the zenith angle (geometrical
distortions of the telescope due to gravity), temperature etc. Therefore, the
pixel-to-sky mapping has to be obtained from the data itself. Two methods are
used to achieve this. The first makes use of an external reference catalog,
such as the US Naval Observatory (USNO) catalog for point sources; it contains
about 2 point sources per arcmin$^2$ (at high Galactic latitudes) with
~ 0.3 arcsec positional accuracy. Matching point sources on the exposures with
those in the USNO catalog therefore yields the mapping with sub-arcsecond
accuracy. Far higher accuracy of the relative astrometry is achieved (and
needed) from internal astrometry, which is obtained by matching objects which
appear at different pixel coordinates, and in particular on different CCDs, for
the various dithering positions. Whereas the sky coordinates are constant, the
pixel coordinates change between dithering positions. Since the distortion map
can be described by a low-order polynomial, the comparison of many objects
appearing at (substantially) different pixel positions yields many more
constraints than there are free parameters in the distortion map, and thus
yields the distortion map with much higher relative accuracy than external
data. The corresponding astrometric solution can routinely achieve an accuracy
of 0.1 pixel, or typically 0".02, compared with a typical field size of ~ 30'.
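
The statement that internal astrometry is strongly over-constrained can be made
explicit with a rough parameter count. The notation below (pixel coordinates
$x_1, x_2$ on a chip, polynomial order $N$, coefficients $a^{(i)}_{mn}$, number
of detections $K$) is introduced here for illustration only, and the count is
simplified in that it ignores the pointing offsets of the individual exposures.

```latex
% Polynomial distortion map of order N for one chip (assumed form):
\[
\theta_i = \sum_{m+n \le N} a^{(i)}_{mn}\, x_1^m\, x_2^n , \quad i = 1, 2 ,
\qquad
N_{\rm par} = 2 \times \frac{(N+1)(N+2)}{2} = 20 \ \ \text{per chip for } N = 3 .
\]
% An object detected in K dithered exposures provides 2K measured pixel
% coordinates but has only 2 unknown sky coordinates:
\[
N_{\rm con} = \sum_{\rm objects} 2\,(K - 1) \gg N_{\rm par} .
\]
% Quoted accuracy, assuming 0".2 pixels and a 30' field:
\[
0.1\ \text{pixel} \times 0''.2\ \text{pixel}^{-1} = 0''.02 ,
\qquad
\frac{0''.02}{30 \times 60''} \approx 1.1 \times 10^{-5} .
\]
```

With a few hundred matched objects per chip and several dither positions, the
number of constraints exceeds the number of polynomial coefficients by a large
factor, which is why the relative solution is so much tighter than the
~ 0.3 arcsec accuracy of the external catalog.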

Photometric solution. Flatfielding corrects for the different sensitivities
of the pixels and therefore yields accurate relative photometry across
individual exposures. The different exposures are tied together by matching
the brightness of objects they have in common, in particular across chip
boundaries. To achieve an absolute photometric calibration, one needs external
data (e.g., standard star observations).

The coaddition process. Coaddition has to happen with sub-pixel accu- 
racy; hence, one cannot just shift pixels from different exposures on top of 
each other, although this procedure is still used by some groups. The by-now 
standard method is drizzling (Fruchter & Hook 2002), in which a new pixel 
frame is defined which usually has smaller pixel size than the original image 
pixels (typically by a factor of two) and which is linearly related to the sky 
coordinates. The astrometrically and photometrically calibrated individual 
frames are now remapped onto this new pixel grid, and the pixel values are 
summed up into the sub-pixel grid, according to the overlap area between
exposure pixel and drizzle pixel (see Fig. 11). Drizzling is thereby
automatically flux conserving. In the coaddition process, weights are assigned,
accounting for the noise properties of the individual exposures (including the
masks, of course).
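
The overlap-area bookkeeping of drizzling is easiest to see in one dimension.
The sketch below is a toy model written for this purpose, not the algorithm of
Fruchter & Hook (2002) in full: it ignores geometric distortions and pixel
shrinking, and the function name and argument layout are assumptions of the
example.

```python
def drizzle_1d(exposures, n_out, out_pix):
    """Drop 1-D exposures onto a common output grid by overlap area.

    exposures: list of (start, in_pix, counts, weight); input pixel i of an
               exposure covers [start + i*in_pix, start + (i+1)*in_pix).
    n_out, out_pix: number and size of output pixels; the grid starts at 0.
    Returns (coadd, weight_map): counts per output pixel and summed weight.
    """
    flux = [0.0] * n_out
    wmap = [0.0] * n_out
    for start, in_pix, counts, weight in exposures:
        for i, c in enumerate(counts):
            lo = start + i * in_pix
            hi = lo + in_pix
            first = max(int(lo // out_pix), 0)
            last = min(int(hi // out_pix), n_out - 1)
            for j in range(first, last + 1):
                overlap = min(hi, (j + 1) * out_pix) - max(lo, j * out_pix)
                if overlap <= 0.0:
                    continue
                flux[j] += weight * c * overlap / in_pix   # share of the counts
                wmap[j] += weight * overlap / out_pix      # covered fraction
    coadd = [f / w if w > 0.0 else 0.0 for f, w in zip(flux, wmap)]
    return coadd, wmap


# One exposure dropped onto output pixels of half the input size.
coadd_single, _ = drizzle_1d([(0.0, 1.0, [1.0, 5.0, 2.0, 0.0], 1.0)], 8, 0.5)

# Two dithered exposures of a uniform sky (2 counts per input pixel),
# combined with different weights.
coadd_sky, weight_sky = drizzle_1d(
    [(0.0, 1.0, [2.0] * 4, 1.0), (0.3, 1.0, [2.0] * 4, 3.0)], 8, 0.5)
```

In the first example the summed output equals the summed input counts, which
is the flux conservation mentioned above; in the second, the uniform sky is
recovered in every output pixel although the weight map varies.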

The result of the coaddition procedure is then a science frame, plus a weight
map which contains information about the pixel noise, which is of course
spatially varying, owing to the masks, CCD gaps, removed cosmic rays and bad
pixels. Fig. 12 shows a typical example of a coadded image and its
corresponding weight map.

Fig. 10. This figure shows the geometric distortion of the WFI. Plotted is the
difference of the positions of stars as obtained from a simple translation, and
a third-order astrometric correction obtained in the process of image
reduction. The pattern in the two left chips is due to their rotation relative
to the other six chips. Whereas this effect looks dramatic at first sight, the
maximum length of the sticks corresponds to about 6 pixels, or 1".2. Given that
the WFI covers a field of ~ 33', the geometrical distortions are remarkably
small; however, they are sufficiently large that they have to be taken into
account in the coaddition process (source: T. Erben & M. Schirmer)

Fig. 11. The principle of drizzling in the process of coaddition is shown. The
pixel grid of each individual exposure is mapped onto an output grid, where the
shifts and geometric distortions obtained during the astrometric solutions are
applied. The counts of the input pixel, multiplied by the relative weight of
this pixel, are then dropped onto the output pixels, according to the relative
overlap area, where the output pixels can be chosen smaller than the input
pixels. The same procedure is applied to the weight maps of the individual
exposures. If many exposures are coadded, the input pixel can also be shrunk
before dropping onto the output pixel. After processing all individual
exposures in this way, a coadded image and a coadded weight map are obtained
(source: T. Schrabback)

The quality of the coadded image can be checked in a number of ways.
Coaddition should not erase information contained in the original exposures
(except, of course, the variability of sources). This means that the PSF of the
coadded image should not be larger than the weighted mean of the PSFs of the
individual frames. Insufficient relative astrometry would lead to a blurring of
images in the coaddition. Furthermore, the anisotropy of the PSF should be
similar to the weighted mean of the PSF anisotropies of the individual frames;
again, insufficient astrometry could induce an artificial anisotropy of the PSF
in the coaddition (which can be easily visualized by adding two round images
with a slight center offset, where a finite ellipticity would be induced).
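
The size of this induced ellipticity can be estimated with a short
calculation. It assumes unweighted second moments and the ellipticity
$\chi = (Q_{11} - Q_{22} + 2{\rm i}Q_{12})/(Q_{11} + Q_{22})$, which is taken
here to be the definition used earlier in the text; the offset $d$ and the
Gaussian PSF are illustrative choices.

```latex
% Two identical round images with Q_{11} = Q_{22} = \sigma^2, Q_{12} = 0,
% centred at (+d/2, 0) and (-d/2, 0) and added with equal flux:
\[
Q_{11} = \sigma^2 + \frac{d^2}{4}, \qquad Q_{22} = \sigma^2, \qquad Q_{12} = 0 ,
\]
\[
\chi = \frac{Q_{11} - Q_{22}}{Q_{11} + Q_{22}} = \frac{d^2}{8\sigma^2 + d^2} .
\]
% Gaussian PSF with FWHM = 0".8, i.e. \sigma \approx 0".34:
%   d = 0".02 (0.1 pixel of 0".2)  ->  \chi \approx 4 \times 10^{-4}
%   d = 0".2  (one such pixel)     ->  \chi \approx 0.04
```

Under these assumptions a registration error of a full pixel would produce a
spurious PSF ellipticity of several percent, whereas the 0.1 pixel accuracy
quoted for the astrometric solution keeps it well below the percent level.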

Probably, there does not exist the 'best' coadded image from a given set of 
individual exposures. This can be seen by considering a set of exposures with 
fairly different individual seeing. If one is mainly interested in photometric 
properties of rather large galaxies, one would prefer a coaddition which puts 
all the individual exposures together, in order to maximize the total exposure 
time and therefore to minimize the photometric noise of the coadded sources. 
For weak lensing purposes, such a coaddition is certainly not optimal, as 
adding exposures with bad seeing together with those of good seeing creates 
a coadded image with a seeing intermediate between the good and the bad. 
Since seeing is a much more important quantity than depth for the shape 
determination of faint and small galaxy images, it would be better to coadd 
only the images with the good seeing. In this respect, the fact that large 
imaging instruments are operated predominantly in service observing mode
employing queue scheduling is a very valuable asset: data for weak lensing 
studies are then taken only if the seeing is better than a specified limit; in 
this way one has a good chance to get images of homogeneously good seeing 
conditions. 

Fig. 12. A final coadded frame from a large number of individual exposures with
the WFI is shown in the upper left panel, with the corresponding weight map at
the upper right. The latter clearly shows the large-scale inhomogeneity of the
chip sensitivity and the illumination, together with the different number of
exposures contributing to various regions in the output image due to dithering
and the gaps between CCDs. The two lower panels show a blow-up of the central
part. Despite the highly inhomogeneous weight, the coadded image apparently
shows no trace of the gaps, which indicates that a highly accurate relative
photometric solution was obtained (source: T. Erben & M. Schirmer)

As a specific example, we show in Fig. 13 the 'deepest wide-field image in the
Southern sky', targeted towards the Chandra Deep Field South (CDFS), one of the
regions in the sky in which all major observatories have agreed to obtain, and
make publicly available, very deep images for a detailed multi-band study. For
example, the Hubble Ultra Deep Field (Beckwith et al. 2003) is located in the
CDFS, the deepest Chandra X-ray exposures are taken in this field, as well as
two ACS@HST mosaic images, one called the GOODS field (Great Observatories
Origins Deep Survey; cf. Giavalisco & Mobasher 2004), the other the GEMS survey
(Rix et al. 2004).

3.4 Image analysis 

The final outcome of the data reduction steps described above is an image 
of the sky, together with a weight map providing the noise properties of the 
image. The next step is the scientific exploitation of this image, which in the
case of weak lensing includes the identification of sources and the measurement
of their magnitudes, sizes and shapes.

As a first step, individual sources on the image need to be identified, to 
obtain a catalog of sources for which the ellipticities, sizes and magnitudes 
are to be determined later. This can be done with by-now standard software, like
SExtractor (Bertin & Arnouts 1996), or may be part of specialized software
packages developed specifically for weak lensing, such as IMCAT, developed
by Nick Kaiser (see below). Although this first step seems straightforward at
first glance, it is not: images of sources can be overlapping, the brightness 
distribution of many galaxies (in particular those with active star formation) 
tends to be highly structured, with a collection of bright spots, and therefore 
the software must be taught whether or not these are to be split into different 
sources, or be taken as one (composite) source. This is not only a software 
problem; in many cases, even visual inspection cannot decide whether a given 
light distribution corresponds to one or several sources. The shape and size 
of the images are affected by the point-spread function (PSF), which results 
from the telescope optics, but for ground-based images, is dominated by the 
blurring caused by the atmospheric turbulence; furthermore, the PSF may 
be affected by telescope guiding and the coaddition process described earlier. 

Fig. 13. A multi-color WFI image of the Chandra Deep Field South (CDFS), taken
with the WFI at the MPG/ESO 2.2-m telescope; the field is slightly larger than
one-half degree on the side. To obtain this image, about 450 different WFI
exposures were combined, resulting in a total exposure time of 15.8 hours in B,
15.6 hours in V, and 17.8 hours in R. The data were obtained in the frame of
three different projects: the GOODS project, the public ESO Imaging Survey, and
the COMBO-17 survey. These data were reduced and coadded by Mischa Schirmer &
Thomas Erben; more than 2 TB of disk space were needed for the reduction
(ESO PR Photo 02a/03, 10 January 2003, © European Southern Observatory)

The point-spread function. Atmospheric turbulence and the other effects
mentioned above smear the image of the sky, according to

$$I^{\rm obs}(\boldsymbol{\theta}) = \int \mathrm{d}^2\vartheta \; I(\boldsymbol{\vartheta})\, P(\boldsymbol{\theta} - \boldsymbol{\vartheta}) , \qquad (28)$$

where $I(\boldsymbol{\theta})$ is the brightness profile outside the
atmosphere, $I^{\rm obs}(\boldsymbol{\theta})$ the observed brightness profile,
and $P$ is the PSF; it describes how point sources would appear on the image.
To first approximation, the PSF is a bell-shaped function; its full width at
half maximum (FWHM) is called the 'seeing' of the image. At excellent sites,
and with excellent telescopes, the seeing has a median of ~ 0".7-0".8;
exceptionally, images with a seeing of ~ 0".5 can be obtained. Recall that
typical faint galaxies are considerably smaller than this seeing size, hence
their appearance is dominated by the PSF.

The main effect of seeing on image shapes is that it makes an elliptical 
source rounder: a small source with a large ellipticity will nevertheless appear 
as a fairly round image if its size is considerably smaller than the PSF. If not 
properly corrected for, this smearing effect would lead to a serious
underestimate of ellipticities, and thus of the shear estimates. Furthermore,
the PSF is not fully isotropic; small anisotropies can be introduced by guiding
errors, the coaddition, the telescope optics, bad focusing etc. An anisotropic
PSF makes round sources elliptical, and therefore mimics a shear. Here too, the
effect of the PSF anisotropy depends on the image size and is strongest for
the smallest sources. PSF anisotropies of several percent are typical; hence,
if not corrected for, their effect can be larger than the shear to be measured.
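
Both effects can be illustrated with unweighted second moments, which simply
add under convolution. The sketch below assumes the ellipticity
$\chi = (Q_{11} - Q_{22} + 2{\rm i}Q_{12})/(Q_{11} + Q_{22})$ and Gaussian-like
profiles; the source sizes and the 5% PSF anisotropy are made-up numbers chosen
for illustration.

```python
def ellipticity(Q):
    """Complex ellipticity chi from second moments Q = (Q11, Q12, Q22)."""
    q11, q12, q22 = Q
    return complex(q11 - q22, 2.0 * q12) / (q11 + q22)


def convolve(Q_source, Q_psf):
    """Unweighted second moments of a convolution add component by component."""
    return tuple(s + p for s, p in zip(Q_source, Q_psf))


FWHM_TO_SIGMA = 1.0 / 2.3548  # Gaussian: FWHM = 2 sqrt(2 ln 2) sigma

# Elliptical source with rms sizes 0.2" x 0.1", seen through 0.8" seeing.
source = (0.2 ** 2, 0.0, 0.1 ** 2)
sigma_psf = 0.8 * FWHM_TO_SIGMA
round_psf = (sigma_psf ** 2, 0.0, sigma_psf ** 2)
chi_true = ellipticity(source)
chi_seen = ellipticity(convolve(source, round_psf))

# Round sources of different size behind a slightly anisotropic PSF.
aniso_psf = (1.05 * sigma_psf ** 2, 0.0, 0.95 * sigma_psf ** 2)
chi_small = ellipticity(convolve((0.01, 0.0, 0.01), aniso_psf))
chi_large = ellipticity(convolve((0.10, 0.0, 0.10), aniso_psf))

print(f"smearing:   chi {chi_true.real:.2f} -> {chi_seen.real:.2f}")
print(f"anisotropy: small {chi_small.real:.3f}, large {chi_large.real:.3f}")
```

In this toy case the intrinsic ellipticity of 0.6 is reduced to roughly 0.1 by
the seeing, and the spurious ellipticity from the PSF anisotropy is larger for
the smaller of the two round sources, as stated above.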

The PSF can be measured at the position of stars (point sources) on the field;
if it is a smooth function of position, it can be fitted by a low-order
polynomial, which then yields a model for the PSF at all points, in particular
at every image position, and one can correct for the effects of the PSF. A
potential problem occurs if the PSF jumps across chip boundaries in multi-chip
cameras, since then the coaddition produces PSF jumps on the coadded frame;
this happens in cameras where the chips are not sufficiently planar, and thus
not in focus simultaneously. For the WFI at the ESO/MPG 2.2-m telescope this is
not a problem, but for some other cameras the problem exists and is severe.
There is an obvious way to deal with it, namely to coadd data only from the
same CCD chip. In this case, the gaps between chips cannot be closed in the
coadded image, but for most weak lensing purposes this is not a very serious
issue. In order not to lose too much area in this coaddition, the dither
pattern, i.e., the pointing differences in the individual exposures, should be
kept small; however, it should not be smaller than, say, 20", since otherwise
some pixels may always fall onto a few larger galaxies in the field, which then
causes problems in constructing a superflat. Furthermore, small shifts between
exposures mean that the number of objects falling onto different chips in
different exposures is small, thus reducing the accuracy of the astrometric
solution. In any case, the dither strategy should be constructed for each
camera individually, taking into account its detailed properties.
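
The polynomial PSF model mentioned at the start of this paragraph amounts to
an ordinary least-squares fit to the values measured at the stars. The sketch
below fits a second-order polynomial to one PSF quantity (for instance one
ellipticity component); the choice of order, the basis, the helper names and
the synthetic star catalog are all assumptions made for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def basis(x, y):
    """Second-order polynomial basis in field coordinates scaled to [-1, 1]."""
    return [1.0, x, y, x * x, x * y, y * y]


def fit_psf_component(stars):
    """Least-squares fit; stars is a list of (x, y, value) measured at stars."""
    nb = len(basis(0.0, 0.0))
    A = [[0.0] * nb for _ in range(nb)]
    b = [0.0] * nb
    for x, y, value in stars:
        phi = basis(x, y)
        for i in range(nb):
            b[i] += phi[i] * value
            for j in range(nb):
                A[i][j] += phi[i] * phi[j]
    return solve(A, b)   # normal equations (A^T A) c = A^T v


def psf_model(coeffs, x, y):
    """Evaluate the fitted polynomial at an arbitrary (galaxy) position."""
    return sum(c * p for c, p in zip(coeffs, basis(x, y)))


# Synthetic stars on a 5x5 grid with a smooth, made-up anisotropy pattern.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
stars = [(x, y, 0.02 + 0.01 * x - 0.005 * y + 0.003 * x * y)
         for x in grid for y in grid]
coeffs = fit_psf_component(stars)
q_at_galaxy = psf_model(coeffs, 0.3, -0.4)
```

If the PSF jumps between chips, as discussed above, a single polynomial over
the whole field is no longer adequate and the fit would have to be done per
chip.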

3.5 Shape measurements 

Specific software has been developed to deal with the issues mentioned above; 
the one that is most in use currently has been developed by Kaiser et al. (1995; 
hereafter KSB), with substantial additions by Luppino & Kaiser (1997), and 
later modifications by Hoekstra et al. (1998). The numerical implementation
of this method is called IMCAT and is publicly available. The basic features
of this method shall be outlined next. 
First one notes that the definition (6) of the second-order moments of the
image brightness is not very practical for applying it to real data. As the
effective range of integration depends on the surface brightness of the image
(through the weight function $q_I$), the presence of noise enters the
definition of the $Q_{ij}$ in a non-linear fashion. Furthermore, neighboring
images can lead to very irregularly shaped integration ranges. In addition,
this definition is hampered by the discreteness of pixels. For these reasons,
the definition is modified by introducing a weight function
$q_\theta(\boldsymbol{\theta})$ which depends explicitly on the image
coordinates,

$$Q_{ij} = \frac{\int \mathrm{d}^2\theta \; q_\theta(\boldsymbol{\theta})\, I(\boldsymbol{\theta})\, (\theta_i - \bar\theta_i)(\theta_j - \bar\theta_j)}{\int \mathrm{d}^2\theta \; q_\theta(\boldsymbol{\theta})\, I(\boldsymbol{\theta})} , \qquad (29)$$

where the size of the weight function $q_\theta$ is adapted to the size of the
galaxy image (for optimal S/N measurement). One typically chooses $q_\theta$ to
be a circular Gaussian. The image center is defined as before, but also with
the new weight function $q_\theta(\boldsymbol{\theta})$ instead of $q_I(I)$.
However, with this definition, the transformation between image and source
brightness moments is no longer simple; in particular, the relation (11)
between the second-order brightness moments of source and image no longer
holds. The explicit spatial dependence of the weight, introduced for very good
practical reasons, destroys the convenient relations that we derived earlier:
welcome to reality.
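
A direct pixel-sum version of the weighted moments (29) may help to see what
the weight function does. This is a bare-bones sketch, not the IMCAT
implementation: the centroid iteration, the starting point at the brightest
pixel and the ellipticity
$\chi = (Q_{11} - Q_{22} + 2{\rm i}Q_{12})/(Q_{11} + Q_{22})$ are assumptions
of the example.

```python
import math


def weighted_moments(image, sigma_w, n_iter=10):
    """Gaussian-weighted centroid and second moments of image[y][x].

    sigma_w is the width of the circular Gaussian weight in pixels.
    Returns ((cx, cy), (Q11, Q12, Q22), chi) with axis 1 = x, axis 2 = y.
    """
    ny, nx = len(image), len(image[0])

    def sums(cx, cy):
        norm = sx = sy = q11 = q12 = q22 = 0.0
        for y in range(ny):
            for x in range(nx):
                dx, dy = x - cx, y - cy
                w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_w ** 2))
                wi = w * image[y][x]
                norm += wi
                sx += wi * dx
                sy += wi * dy
                q11 += wi * dx * dx
                q12 += wi * dx * dy
                q22 += wi * dy * dy
        return norm, sx, sy, q11, q12, q22

    # Start at the brightest pixel and iterate the weighted centroid.
    _, cy, cx = max((image[y][x], y, x) for y in range(ny) for x in range(nx))
    cx, cy = float(cx), float(cy)
    for _ in range(n_iter):
        norm, sx, sy, *_rest = sums(cx, cy)
        cx += sx / norm
        cy += sy / norm
    norm, _, _, q11, q12, q22 = sums(cx, cy)
    Q = (q11 / norm, q12 / norm, q22 / norm)
    chi = complex(Q[0] - Q[2], 2.0 * Q[1]) / (Q[0] + Q[2])
    return (cx, cy), Q, chi


# Noise-free elliptical Gaussian with sigma_x = 3 and sigma_y = 2 pixels.
image = [[math.exp(-((x - 20) ** 2 / 18.0 + (y - 20) ** 2 / 8.0))
          for x in range(41)] for y in range(41)]
center, Q, chi = weighted_moments(image, sigma_w=4.0)
```

For this test image the unweighted ellipticity would be (9 - 4)/(9 + 4), about
0.38, whereas the weighted moments give about 0.29: the measured value depends
on the weight function, which is the loss of the simple relations described
above.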

In KSB, the anisotropy of the PSF is characterized by its (complex) ellipticity
$q$, measured at the positions of the stars, and fitted by a low-order
polynomial. Assume that the (reduced) shear $g$ and the PSF anisotropy $q$ are
small; then they both will have a small effect on the measured ellipticity.
Linearizing these two effects, one can write (employing the Einstein summation
convention)

$$\chi^{\rm obs}_\alpha = \chi^0_\alpha + P^{\rm sm}_{\alpha\beta}\, q_\beta + P^g_{\alpha\beta}\, g_\beta . \qquad (30)$$

The interpretation of the various terms is found as follows: First consider an
image in the absence of shear and the case of an isotropic PSF; then
$\chi^{\rm obs} = \chi^0$; thus, $\chi^0$ is the image ellipticity one would
obtain for $q = 0$ and $g = 0$; it is the source smeared by an isotropic PSF.
It is important to note that $E(\chi^0) = 0$, due to the random orientation of
sources. The tensor $P^{\rm sm}$ describes how the image ellipticity responds
to the presence of a PSF anisotropy; similarly, the tensor $P^g$ describes the
response of the image ellipticity to shear in the presence of smearing by the
seeing disk. Both $P^{\rm sm}$ and $P^g$ have to be calculated for each image
individually; they depend on higher-order moments of the brightness
distribution and the size of the PSF. A full derivation of the explicit
equations can be found in Sect. 4.6.2 of BS01. Given that
$\langle\chi^0\rangle = 0$, an estimate of the (reduced) shear is provided by

$$\epsilon = (P^g)^{-1} \left( \chi^{\rm obs} - P^{\rm sm} q \right) . \qquad (31)$$

If the source size is much smaller than the PSF, the magnitude of $P^g$ can be
very small, i.e., the correction factor in (31) can be very large. Given that
the measured ellipticity $\chi^{\rm obs}$ is affected by noise, this noise then
also gets multiplied by a large factor. Therefore, depending on the magnitude
of $P^g$, the errors of the shear estimates differ between images; this can be
accounted for by specifically weighting these estimates when using them for
statistical purposes (e.g., in the estimate of the mean shear in a given
region). Different authors use different weighting schemes when applying KSB.
Also, the tensors $P^{\rm sm}$ and $P^g$ are expected to depend mainly on the
size of the image and its signal-to-noise; therefore, it is advantageous to
average these tensors over images having the same size and S/N, instead of
using the individual tensor values which are of course also affected by noise.
Erben et al. (2001) and Bacon et al. (2001) have tested the KSB scheme on
simulated data and in particular investigated various schemes for weighting
shear estimates and for determining the tensors in (30); they concluded that
simulated shear values can be recovered with a systematic uncertainty of about
10%.
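
Equation (31) is a 2x2 linear inversion per galaxy, which the following sketch
spells out. The tensor values are invented for illustration (in practice they
come from the higher-order moments of each image), and the function name is
hypothetical.

```python
def ksb_shear(chi_obs, q, P_sm, P_g):
    """Shear estimate eps = (P_g)^-1 (chi_obs - P_sm q), cf. Eq. (31).

    chi_obs, q: two-component ellipticities (c1, c2).
    P_sm, P_g:  2x2 tensors given as nested lists.
    """
    # Remove the PSF-anisotropy term from the observed ellipticity.
    c = [chi_obs[a] - sum(P_sm[a][b] * q[b] for b in range(2)) for a in range(2)]
    # Invert the 2x2 shear-response tensor explicitly.
    det = P_g[0][0] * P_g[1][1] - P_g[0][1] * P_g[1][0]
    inv = [[P_g[1][1] / det, -P_g[0][1] / det],
           [-P_g[1][0] / det, P_g[0][0] / det]]
    return tuple(sum(inv[a][b] * c[b] for b in range(2)) for a in range(2))


# Forward model with made-up numbers: chi_obs from Eq. (30) with chi^0 = 0.
g_true = (0.03, -0.01)
q_psf = (0.02, 0.01)
P_sm = [[0.90, 0.00], [0.00, 0.85]]
P_g = [[0.40, 0.01], [0.01, 0.38]]
chi_obs = tuple(sum(P_sm[a][b] * q_psf[b] + P_g[a][b] * g_true[b]
                    for b in range(2)) for a in range(2))
g_est = ksb_shear(chi_obs, q_psf, P_sm, P_g)

# Noise amplification for a small source: a 0.05 error in chi_obs becomes
# a 0.5 error in the shear estimate when P_g is only 0.1.
amplified = ksb_shear((0.05, 0.0), (0.0, 0.0),
                      [[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.0], [0.0, 0.1]])
```

The second call shows the point made above about small sources: the inverse of
a small $P^g$ multiplies the measurement noise along with the signal, which is
why such galaxies are down-weighted in practice.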

Maybe by now you are confused - what is 'real ellipticity' of an image, 
independent of weights etc.? Well, this question has no answer, since only 
images with conformal elliptical isophotes have a 'real ellipticity'. By the 
way, not necessarily the one that is the outcome of the KSB procedure. The 
KSB process does not aim toward measuring 'the' ellipticity of any individual 
galaxy image; it tries to measure 'an' ellipticity which, when averaged over a
random intrinsic orientation of the source, yields an unbiased estimate of the 
reduced shear. 

Given that the shape measurements of faint galaxies and their correction
for PSF effects are central for weak lensing, several different schemes for
measuring shear have been developed (e.g., Valdes et al. 1983; Bonnet & 
Mellier 1995; Kuijken 1999; Kaiser 2000; Refregier 2003b; Bernstein & Jarvis 
2002). In the shapelet method of Refregier (2003b; see also Refregier & Ba- 
con 2003), the brightness distribution of galaxy images is expanded in a set 
of basis functions ('shapelets') whose mathematical properties are particu- 
larly convenient. With a corresponding decomposition of the PSF (the shape 
of stars) into these shapelets and their low-order polynomial fit across the 
image, a partial deconvolution of the measured images becomes possible, us- 
ing linear algebraic relations between the shapelet coefficients. The effect of a 
shear on the shapelet coefficients can be calculated, yielding then an estimate 
of the reduced shear. In contrast to the KSB scheme, higher-order brightness 
moments, and not just the quadrupoles, of the images are used for the shear 
estimate. 

These alternative methods for measuring image ellipticities (in the sense 
mentioned above, namely to provide an unbiased estimate of the local reduced 
shear) have not been tested yet to the same extent as is true for the KSB 
method. Before they become a standard in the field of weak lensing, several 
groups need to independently apply these techniques to real and synthetic 
data sets to evaluate their strengths and weaknesses. In this regard, one 
needs to note that weak lensing has, until recently, been regarded by many 
researchers as a field where the observational results are difficult to 'believe'
(and sure, not all colleagues have given up this view, yet). The difficulty
of displaying the directly measured quantities graphically so that they can be
directly 'seen' makes it difficult to convince others about the reliability of the
measurements. The fact that the way from the coadded imaging data to the
final result is, except for the researchers who actually do the analysis, close
to a black box with hardly any opportunity to display intermediate results
(which would provide others with a quality check) implies that the methods
employed should be standardized and well checked.

Surprisingly enough, there are very few (published) attempts in which the
same data set is analyzed by several groups independently and the intermediate
and final results are compared. Kleinheinrich (2003), in her dissertation,
has taken several subsets of the data that led to the deep image shown in
Fig. 13 and compared the individual image ellipticities between the various
subsets. If the subsets had comparable seeing, the measured ellipticities could
be fairly well reproduced, with an rms difference of about 0.15, which is small
compared to the dispersion of the image ellipticities, $\sigma_\epsilon \sim 0.35$.
Hence, these differences, which presumably are due to the different noise
realizations on the different images, are small compared to the 'shape noise'
coming from the finite intrinsic ellipticities of galaxies. If the subsets had
fairly different seeing, the smearing correction turns out to lead to a
systematic bias in the measured ellipticities. From the size of this bias, the
conclusions obtained from the simulations are confirmed - measuring a shear
with better than ~ 10% accuracy will be difficult with the KSB method, where
the main problem lies in the smearing correction.
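
To illustrate why an rms difference of 0.15 is subdominant, one can add it in
quadrature to the intrinsic dispersion. The following is only a sketch: it
assumes that the two contributions are independent, which is not stated above,
and the function name is purely illustrative.

```python
import math

def total_ellipticity_dispersion(sigma_intrinsic, sigma_noise):
    """Combine two dispersions in quadrature (assumes they are independent)."""
    return math.hypot(sigma_intrinsic, sigma_noise)

sigma_eps = 0.35    # dispersion of image ellipticities quoted in the text
sigma_meas = 0.15   # rms difference between data subsets quoted in the text

sigma_tot = total_ellipticity_dispersion(sigma_eps, sigma_meas)
increase = sigma_tot / sigma_eps - 1.0
print(f"total dispersion = {sigma_tot:.3f}, increase = {100 * increase:.1f}%")
```

Under this assumption the noise on an individual ellipticity grows by less
than ten per cent, so the statistical error of a shear estimate remains
dominated by shape noise.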

Shear observations from space. We conclude this section with a few
comments on weak lensing observations from space. Since the PSF is the largest
problem in shear measurements, one might be tempted to use observations
from space which are not affected by the atmosphere. At present, the Hubble
Space Telescope (HST) is the only spacecraft that can be considered for this
purpose. Weak lensing observations have been carried out using two of its
instruments, WFPC2 and STIS. The former has a field-of-view of about 5
arcmin$^2$, whereas STIS has a field of $51''$. These small fields imply that the
number of stars that can be found on any given exposure at high galactic
latitude is very small, in fact typically zero for STIS. Therefore, the PSF cannot
be measured from these exposures themselves. Given that an instrument in
space is expected to be much more stable than one on the ground, one might
expect that the PSF is stable in time; then, it can be investigated by analyzing
exposures which contain many stars (e.g., from a star cluster). In fact,
Hoekstra et al. (1998) and Hämmerle et al. (2002) have shown that the PSFs
of WFPC2 and STIS are approximately constant in time. The situation is
improved with the new camera ACS onboard HST, where the field size of
$\sim 3.4'$ is large enough to contain about a dozen stars even at high galactic
latitude, and where some control over the PSF behavior on individual images
is obtained. We shall discuss the PSF stability of the ACS in Sect. 7.3 below.

The PSF of a diffraction-limited telescope is much more complex than that
of the seeing-dominated one for ground-based observations. The assumption
underlying the KSB method, namely that the PSF can be described by an
axi-symmetric function convolved with a small anisotropic kernel, is strongly
violated for the HST PSF; it is therefore less obvious how well the shear
measurements with the KSB method work in space. In addition, the HST
PSF is not well sampled with the current imaging instruments, even though
STIS and ACS have a pixel scale of $0.05''$. The number density of cosmic
rays is much larger in space, so their removal can be more cumbersome than
for ground-based observations. The intense particle bombardment also leads
to aging of the CCDs, which lose sensitivity and develop charge-transfer
efficiency problems. Despite these potential problems, a number of highly
interesting weak lensing results obtained with the HST have been reported, in
particular on clusters, and we shall discuss some of them in later sections. The
new Advanced Camera for Surveys (ACS) on-board HST has a considerably
larger field-of-view than previous instruments and will most likely become a
highly valuable tool for weak lensing studies.

4 Clusters of galaxies: Introduction, and strong lensing 
4.1 Introduction 

Galaxies are not distributed randomly, but they cluster together, forming 
groups and clusters of galaxies. Those can be identified as overdensities of 
galaxies projected onto the sky, and this has of course been the original 
method for the detection of clusters, e.g., leading to the famous and still 
heavily used Abell (1958) catalog and its later Southern extension (Abell et 
al. 1989; ACO). Only later - with the exception of Zwicky's early insight
in 1933 that the Coma cluster must contain a lot of missing mass - was it
realized that the visible galaxies are but a minor contribution to the clusters,
since they are dominated by dark matter. From X-ray observations we know
that clusters contain a very hot intracluster gas which emits via free-free and
atomic line radiation. Many galaxies are members of a cluster or a group;
indeed, the Milky Way is one of them, being one of two luminous galaxies
of the Local Group (the other one is M31, the Andromeda galaxy), of which
~ 35 member galaxies are known, most of them dwarfs.

In the first part of this section we shall describe general properties of 
galaxy clusters, in particular methods to determine their masses, before turn- 
ing to their strong lensing properties, such as show up in the spectacular giant 
luminous arcs. Very useful reviews on clusters of galaxies are from Sarazin 
(1986) and in a recent proceedings volume (Mulchaey et al. 2004).

4.2 General properties of clusters

Clusters of galaxies contain tens to hundreds of bright galaxies; their galaxy
population is dominated by early-type galaxies (E's and S0's), i.e. galaxies
without active star formation. Often a very massive cD galaxy is located at
their center; these galaxies differ from normal ellipticals in that they have a 
much more extended brightness profile - they are the largest galaxies. The 
morphology of clusters as seen in their distribution of galaxies can vary a 
lot, from regular, compact clusters (often dominated by a central cD galaxy) 
to a bimodal distribution, or highly irregular morphologies with strong sub- 
structure. Since clusters are at the top of the mass scale of virialized objects, 
the hierarchical merging scenario of structure growth predicts that many of 
them have formed only recently through the merging of two or more lower- 
mass sub-clusters, and so the irregular morphology just indicates that this 
happened. 

X-ray observations reveal the presence of a hot (several keV) intracluster 
medium (ICM) which is highly enriched in heavy elements; hence, this gas 
has been processed through star-formation cycles in galaxies. The mass of the 
ICM surpasses that of the baryons in the cluster galaxies; the mass balance in 
clusters is approximately as follows: stars in cluster galaxies contribute ~ 3%
of the total mass, the ICM another ~ 15%, and the rest (~ 80%) is dark
matter. Hence, clusters are dominated by dark matter; as discussed below
(Sect. 4.3), the mass of clusters can be determined with three vastly different
methods which overall yield consistent results, leading to the aforementioned
mass ratio.

We shall now quote a few characteristic values which apply to rich, massive
clusters. Their virial radius, i.e., the radius inside of which the mass
distribution is in approximate virial equilibrium (or the radius inside of which
the mean mass density of clusters is $\sim 200$ times the critical density of the
Universe - cf. Sect. 4.5 of IN) is $r_{\rm vir} \sim 1.5\,h^{-1}$ Mpc. A typical value
for the one-dimensional velocity dispersion of the member galaxies is
$\sigma_v \sim 1000$ km/s. In equilibrium, this equals the thermal velocity of the
ICM, corresponding to a temperature of $T \sim 10^{7.5}$ K $\sim 3$ keV. The mass of
massive clusters within the virial radius (i.e., the virial mass) is
$\sim 10^{15}\,M_\odot$. The mass-to-light ratio of clusters (as measured from the
B-band luminosity) is typically of order $(M/L) \sim 300\,h^{-1}\,(M_\odot/L_\odot)$.
Of course, the much more numerous typical clusters have smaller masses (and
temperatures).
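
The quoted numbers can be checked against each other. The sketch below
computes the mass of a sphere of radius $r_{\rm vir}$ whose mean density is 200
times the critical density, and converts $T = 10^{7.5}$ K into keV. The value
used for the critical density, $2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$,
and the function name are not taken from the text.

```python
import math

RHO_CRIT_H2 = 2.775e11    # critical density in h^2 M_sun / Mpc^3
K_B_EV_PER_K = 8.617e-5   # Boltzmann constant in eV / K

def mass_within_overdensity(r_mpc_h, delta=200.0):
    """Mass (in h^-1 M_sun) of a sphere of radius r (in h^-1 Mpc) whose mean
    density is `delta` times the critical density."""
    return 4.0 / 3.0 * math.pi * delta * RHO_CRIT_H2 * r_mpc_h ** 3

m_vir = mass_within_overdensity(1.5)        # r_vir ~ 1.5 h^-1 Mpc
kT_keV = K_B_EV_PER_K * 10 ** 7.5 / 1.0e3   # k_B T for T ~ 10^7.5 K

print(f"M_vir ~ {m_vir:.2e} h^-1 M_sun, k_B T ~ {kT_keV:.1f} keV")
```

Both numbers agree with the order-of-magnitude values quoted above.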

Cosmological interest for clusters. Clusters are the most massive bound
and virialized structures in the Universe; this, together with the (related)
fact that their dynamical time scale (e.g., the crossing time $\sim r_{\rm vir}/\sigma_v$)
is not much smaller than the Hubble time $H_0^{-1}$ - so that they retain a
'memory' of their formation - renders them of particular interest for
cosmologists. The evolution of their abundance, i.e., their comoving number
density as a function of mass and redshift, is an important probe for
cosmological models and traces the growth of structure; massive clusters are
expected to be much rarer at high redshift than today. Their present-day
abundance provides one of the measures for the normalization of the power
spectrum of cosmological density fluctuations. Furthermore, they form (highly
biased) signposts of the dark matter distribution in the Universe, so their
spatial distribution traces the large-scale mass distribution in the Universe.
Clusters act as laboratories for studying the evolution of galaxies and baryons
in the Universe. Since the galaxy number density is highest in clusters, mergers
of their member galaxies and, more importantly, other interactions between
them occur frequently. Therefore, the evolution of galaxies with redshift is
most easily studied in clusters. For example, the Butcher-Oemler effect (the
fact that the fraction of blue galaxies in clusters is larger at higher redshifts
than today) is a clear sign of galaxy evolution which indicates that star
formation in galaxies is suppressed once they have become cluster members.
More generally, there exists a density-morphology relation for galaxies, with
an increasing fraction of early-types with increasing spatial number density,
with clusters at the extreme end of the latter. Finally, clusters were
(arguably) the first objects for which the presence of dark matter has been
concluded (by Zwicky in 1933). Since they are so large, and represent the
gravitational collapse of a region in space with initial comoving radius of
$\sim 8\,h^{-1}$ Mpc, one expects that their mixture of baryonic and dark matter
is characteristic for the mean mass fraction in the Universe (White et al.
1993). With the baryon fraction of ~ 15% mentioned above, and the density
parameter in baryons determined from big-bang nucleosynthesis in connection
with the determination of the deuterium abundance in Ly$\alpha$ QSO absorption
systems, $\Omega_{\rm b} \approx 0.02\,h^{-2}$, one obtains a density parameter for
matter of $\Omega_{\rm m} \sim 0.3$, in agreement with results from other methods,
most notably from the recent WMAP CMB measurements (e.g., Spergel et al.
2003).
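
Two of the numbers in this paragraph are easily reproduced. The sketch below
divides the baryon density parameter by the cluster baryon fraction and
compares the crossing time with the Hubble time. The value h = 0.7 is an
assumption made here for illustration only; it is not given in the text.

```python
MPC_IN_KM = 3.0857e19   # kilometres per megaparsec
GYR_IN_S = 3.156e16     # seconds per gigayear

h = 0.7                 # assumed dimensionless Hubble constant

# Matter density parameter from the cluster baryon fraction.
omega_b = 0.02 / h**2   # baryon density parameter, Omega_b ~ 0.02 h^-2
f_baryon = 0.15         # baryon fraction of clusters quoted in the text
omega_m = omega_b / f_baryon

# Crossing time r_vir / sigma_v compared with the Hubble time 1 / H_0.
r_vir_km = 1.5 / h * MPC_IN_KM
sigma_v = 1000.0        # km/s
t_cross_gyr = r_vir_km / sigma_v / GYR_IN_S
t_hubble_gyr = MPC_IN_KM / (100.0 * h) / GYR_IN_S

print(f"Omega_m ~ {omega_m:.2f}")
print(f"t_cross ~ {t_cross_gyr:.1f} Gyr, t_Hubble ~ {t_hubble_gyr:.1f} Gyr")
```

With these inputs the crossing time is a sizeable fraction of the Hubble time,
which is the sense in which clusters retain a 'memory' of their formation.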

4.3 The mass of galaxy clusters 

Cosmologists can predict the abundance of clusters as a function of their 
mass (e.g., using numerical simulations); however, the mass of a cluster is 
not directly observable, but only its luminosity, or the temperature of the X- 
ray emitting intra-cluster medium. Therefore, in order to compare observed 
clusters with the cosmological predictions, one needs a way to determine their 
masses. Three principal methods for determining the mass of galaxy clusters 
are in use: 

• Assuming virial equilibrium, the observed velocity distribution of galaxies 
in clusters can be converted into a mass estimate, employing the virial 
theorem; this method typically requires assumptions about the statistical 
distribution of the anisotropy of the galaxy orbits. 

• The hot intra-cluster gas, as visible through its Bremsstrahlung in X-rays,
traces the gravitational potential of the cluster. Under certain assumptions
(see below), the mass profile can be constructed from the X-ray
emission.

• Weak and strong gravitational lensing probes the projected mass profile 
of clusters, with strong lensing confined to the central regions of clusters, 
whereas weak lensing can yield mass measurements for larger radii. 

All three methods are complementary; lensing yields the line-of-sight pro- 
jected density of clusters, in contrast to the other two methods which probe 
the mass inside spheres. On the other hand, those rely on equilibrium (and
symmetry) conditions; e.g., the virial method assumes virial equilibrium (that
the cluster is dynamically relaxed) and requires an assumption about the degree
of anisotropy of the galaxy orbit distribution.

Dynamical mass estimates. Estimating the mass of clusters based on the
virial theorem,

$$2 E_{\rm kin} + E_{\rm pot} = 0 , \tag{32}$$

has been the traditional method, employed by Zwicky in 1933 to find strong
hints for the presence of dark matter in the Coma cluster. The specific kinetic
energy of a galaxy is $v^2/2$, whereas the potential energy is determined by the
cluster mass profile, which can thus be determined using (32). One should
note that only the line-of-sight component of the galaxy velocities can be 
measured; hence, in order to derive the specific kinetic energy of galaxies, 
one needs to make an assumption on the distribution of orbit anisotropies in 
the cluster potential. Assuming an isotropic distribution of orbits, the l.o.s. 
velocity distribution can then be related to the 3-D velocity dispersion, which 
in turn can be transformed into a mass estimate if spherical symmetry is 
assumed. This method requires many redshifts for an accurate mass estimate, 
which are available only for a few clusters. However, a revival of this method 
is expected and already seen by now, owing to the new high-multiplex optical 
spectrographs. 
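
The chain of assumptions just described can be sketched as follows. This is a
crude order-of-magnitude illustration, not a method used in the literature
cited here: it assumes isotropic orbits, so that $\sigma_{3D}^2 = 3\sigma_{\rm los}^2$,
and it writes the specific potential energy as $-GM/r_G$ with a single
'gravitational radius' $r_G$, a quantity introduced here only for illustration.
The velocities are mock data.

```python
import numpy as np

G_MPC = 4.301e-9   # gravitational constant in Mpc (km/s)^2 / M_sun

def virial_mass(v_los_kms, r_grav_mpc):
    """Order-of-magnitude virial mass in M_sun.

    With isotropic orbits the specific kinetic energy is 3 sigma_los^2 / 2;
    inserting E_pot = -G M / r_grav into 2 E_kin + E_pot = 0 gives
    M = 3 sigma_los^2 r_grav / G.
    """
    sigma_los = np.std(v_los_kms, ddof=1)
    return 3.0 * sigma_los**2 * r_grav_mpc / G_MPC

rng = np.random.default_rng(0)
v_los = rng.normal(0.0, 1000.0, size=200)   # mock l.o.s. velocities in km/s
mass = virial_mass(v_los, r_grav_mpc=1.5)
print(f"M ~ {mass:.2e} M_sun")
```

The statistical error of the estimate scales with that of $\sigma_{\rm los}^2$,
which is why many redshifts are needed.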

X-ray mass determination of clusters. The intracluster gas emits via
Bremsstrahlung; the emissivity depends on the gas density and temperature,
and, at lower T, also on its chemical composition, since at $T \lesssim 1$ keV the
line radiation from highly ionized atomic species starts to dominate the total
emissivity of a hot gas. Investigating the properties of the ICM with X-ray
observations has revealed a wealth of information on the properties of
clusters (see Sarazin 1986). Assuming that the gas is in hydrostatic equilibrium
in the potential well of the cluster, the gas pressure P must balance gravity,
or

$$\nabla P = -\rho_{\rm g} \nabla\Phi ,$$

where $\rho_{\rm g}$ is the gas density. In the case of spherical symmetry, this
becomes

$$\frac{1}{\rho_{\rm g}} \frac{dP}{dr} = -\frac{d\Phi}{dr} = -\frac{G M(r)}{r^2} .$$

From the X-ray brightness profile and temperature measurement, M(r), the
mass inside r, both dark and luminous, can then be determined; with the
ideal-gas law $P = \rho_{\rm g} k_{\rm B} T / (\mu m_{\rm p})$ one finds

$$M(r) = -\frac{k_{\rm B} T r^2}{G \mu m_{\rm p}} \left( \frac{d\ln\rho_{\rm g}}{dr} + \frac{d\ln T}{dr} \right) , \tag{33}$$

where $\mu m_{\rm p}$ is the mean particle mass in the gas. Only for relatively few
clusters are detailed X-ray brightness and temperature profile measurements
available. In the absence of a temperature profile measurement, one often
assumes that T does not vary with distance from the cluster center. In this case,
assuming that the dark matter particles also have an isothermal distribution
(with velocity traced by the galaxy velocities), one can show that

$$\rho_{\rm g}(r) \propto \left[ \rho_{\rm tot}(r) \right]^{\beta} ; \quad {\rm with} \quad \beta = \frac{\mu m_{\rm p} \sigma_v^2}{k_{\rm B} T} . \tag{34}$$

Hence, $\beta$ is the ratio between kinetic and thermal energy. The mass profile
corresponding to the isothermality assumption follows from the Lane-Emden
equation which, however, has no closed-form solution. In the King
approximation, the density and X-ray brightness profile (which is obtained by a
line-of-sight integral at projected distance R from the cluster center over the
emissivity, which in turn is proportional to the square of the electron density,
or $\propto \rho_{\rm g}^2$, for an isothermal gas) become

$$\rho_{\rm g}(r) = \rho_{\rm g0} \left[ 1 + \left( \frac{r}{r_{\rm c}} \right)^2 \right]^{-3\beta/2} ; \quad I(R) \propto \left[ 1 + \left( \frac{R}{r_{\rm c}} \right)^2 \right]^{-3\beta + 1/2} ,$$


where $r_{\rm c}$ is the core radius. The observed brightness profile can now be
fitted with these $\beta$-models, yielding estimates of $\beta$ and $r_{\rm c}$ from
which the cluster mass follows. Typical values for $r_{\rm c}$ range from 0.1 to
$0.3\,h^{-1}$ Mpc, and $\beta = \beta_{\rm fit} \sim 0.65$. On the other hand, one can
determine $\beta$ from the temperature T and the galaxy velocity dispersion using
(34), which yields $\beta_{\rm spec} \approx 1$. The discrepancy between these two
estimates of $\beta$ is not well understood and probably indicates that one of
the assumptions underlying the $\beta$-model fails in many clusters, which is not
too surprising (see below).
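
How the cluster mass follows from a fitted $\beta$-model can be sketched as
follows. For an ideal, isothermal gas in hydrostatic equilibrium,
$M(r) = -(k_{\rm B} T\, r / G \mu m_{\rm p})\, d\ln\rho_{\rm g}/d\ln r$, and the
King profile gives $d\ln\rho_{\rm g}/d\ln r = -3\beta r^2/(r^2 + r_{\rm c}^2)$. The
parameter values below (including $\mu = 0.6$ and $k_{\rm B}T = 5$ keV) are
illustrative assumptions, and factors of h are ignored.

```python
G_SI = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.6726e-27  # proton mass, kg
KEV = 1.6022e-16       # one keV in joules
MPC = 3.0857e22        # one megaparsec in metres
M_SUN = 1.989e30       # solar mass, kg

def beta_model_mass(r_mpc, kT_keV, beta, r_core_mpc, mu=0.6):
    """Hydrostatic mass (M_sun) inside radius r for an isothermal gas with
    rho_g proportional to [1 + (r / r_c)^2]^(-3 beta / 2)."""
    # Negative logarithmic density slope, -d ln(rho_g) / d ln(r).
    slope = 3.0 * beta * r_mpc**2 / (r_mpc**2 + r_core_mpc**2)
    m_kg = kT_keV * KEV * (r_mpc * MPC) * slope / (G_SI * mu * M_PROTON)
    return m_kg / M_SUN

m_1mpc = beta_model_mass(1.0, kT_keV=5.0, beta=0.65, r_core_mpc=0.2)
print(f"M(<1 Mpc) ~ {m_1mpc:.2e} M_sun")
```

Well outside the core the mass grows linearly with r, as for an isothermal
sphere, so the model cannot be extended to arbitrarily large radii.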

The hot ICM loses energy through its thermal radiation; the cooling time
$t_{\rm cool}$ of the gas, i.e., the ratio between the thermal energy density and
the X-ray emissivity, is larger than the Hubble time $\sim H_0^{-1}$ for all but
the innermost regions. In the center of clusters, the gas density can be high
enough to have $t_{\rm cool} < H_0^{-1}$, so that there the gas can no longer be in
hydrostatic equilibrium. One expects that the gas flows towards the cluster
center, thereby being compressed and thus maintaining approximate pressure
balance. Such 'cooling flows' (see, e.g., Fabian 1994) are observed indirectly,
through highly peaked X-ray emission in cluster centers which indicates a
strong increase of the gas density; furthermore, these cooling-flow clusters
show a decrease of T towards the center. The mass-flow rate in these clusters
can be as high as $100\,M_\odot\,{\rm yr}^{-1}$ or even more, so that the total cooled
mass can be larger than the baryonic mass of a massive galaxy. However, the
fate of the cooled gas is unknown.

New results from Chandra & XMM. The two X-ray satellites Chandra
and XMM, launched in 1999, have greatly expanded our view of the X-ray
Universe, and have led to a number of surprising results about clusters. X-ray
spectroscopy verified the presence of cool gas near the center of cooling-flow
clusters, but no indication for gas with temperature below ~ 1 keV has been
seen, whereas the cooling is expected to rapidly proceed to very low
temperatures, as the cooling function increases for lower T where atomic
transitions become increasingly important. Furthermore, the new observations
have revealed that at least the inner regions of clusters often show a
considerably more complicated structure than implied by hydrostatic
equilibrium. In some cases, the intracluster medium is obviously affected by a
central AGN, which produces additional energy and entropy input, which might
explain why no sub-keV gas has been detected. As the AGN activity of a galaxy
may be switched on and off, depending on the fueling of the central black hole,
even in clusters without a currently active AGN such heating might have
occurred in the recent past, as indicated in some cases by radio relics. Cold
fronts with very sharp edges (discontinuities in density and temperature, but
such that $P \propto \rho T$ is approximately constant across the front), and
shocks have been discovered, most likely showing ongoing or recent merger
events. In many clusters, the temperature and metallicity appear to be strongly
varying functions of position, which invalidates the assumption of
isothermality underlying the $\beta$-model. Therefore, mass estimates of central
parts of clusters from X-ray observations require special care, and one needs
to revise the simplified models used in the pre-Chandra era. In fact, if there
ever was the belief that the $\beta$-model provides an adequate description of the
gas in a cluster, the results from Chandra and XMM show that this is
unjustified. The physics of the intracluster gas appears to be considerably
more complicated than that.

4.4 Luminous arcs & multiple images

Strong lensing effects in clusters show up in the form of giant luminous arcs,
strongly distorted arclets, and multiple images of background galaxies. Since 
strong lensing only occurs in the central part of clusters, it can be used only 
to probe their inner mass structure. However, strong lensing yields by far 
the most accurate central mass determinations in those cases where several 
strong lensing features can be identified. For a detailed account of strong 
lensing in clusters, the reader is referred to the review by Fort & Mellier 
(1994). 

Furthermore, clusters thus act as a 'natural telescope'; many of the most
distant galaxies have been found by searching behind clusters, employing the
lensing magnification. For example, the recently discovered very high redshift
galaxies at $z \approx 7$ (Kneib et al. 2004) and z = 10 (Pelló et al. 2004) were
found through a search in the direction of the high-magnification region in the
clusters A2218 and A1835, respectively. In the first of these two cases, the
multiple imaging of the background galaxy provides not only the magnification,
but also an estimate of the redshift of the source (which is not determined by
any spectral line), whereas in the latter case, only the implied high
magnification makes the source visible on deep HST images and allows its
spectroscopy, yielding a spectral line which most likely is due to Ly$\alpha$. The
magnification is indeed a very important asset, as can be seen from a simple
example: a value of $\mu = 5$ reduces the observing time for obtaining a spectrum
by a factor 25 (in the case where the noise is sky background dominated) -
which is the difference between being doable or not. Recognizing the power of
natural telescopes, the deepest SCUBA surveys for faint sub-millimeter sources
have been conducted (e.g., Blain et al. 1999) around clusters with
well-constrained (from lensing) mass distribution to reach further down the
(unlensed) flux scale.
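
The factor 25 follows from the scaling of the signal-to-noise ratio with
exposure time. The sketch below assumes an unresolved source whose flux F is
multiplied by the magnification; the function name and the source-limited
branch are additions made here for comparison.

```python
def exposure_time_ratio(mu, background_limited=True):
    """Ratio t_lensed / t_unlensed needed to reach a fixed signal-to-noise
    ratio for an unresolved source magnified by a factor mu.

    Background-limited noise: S/N scales as F sqrt(t), hence t scales as mu^-2.
    Source-limited noise:     S/N scales as sqrt(F t), hence t scales as mu^-1.
    """
    return mu ** -2 if background_limited else mu ** -1

print(exposure_time_ratio(5.0))         # background-limited case of the text
print(exposure_time_ratio(5.0, False))  # source-limited case, for comparison
```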

First go: $M(\le \theta_{\rm E})$. Giant arcs occur where the distortion (and
magnification) is very large, that is near critical curves. To a first
approximation, assuming a spherical mass distribution, the location of the arc
from the cluster center (which usually is assumed to coincide with the
brightest cluster galaxy) yields the Einstein radius of the cluster, so that the
mass estimate (see IN, Eq. 43) can be applied,

$$M(\theta_{\rm E}) = \pi \left( D_{\rm d} \theta_{\rm E} \right)^2 \Sigma_{\rm cr} . \tag{35}$$

Therefore, this simple estimate yields the mass inside the arc radius. However,
this estimate is not very accurate, perhaps good to within ~ 30% (Bartelmann
& Steinmetz 1996). Its reliability depends on the level of asymmetry and
substructure in the cluster mass distribution. Furthermore, it is likely to 
overestimate the mass in the mean, since arcs preferentially occur along the 
major axis of clusters. Of course, the method is very difficult to apply if the 
center of the cluster is not readily identified or if the cluster is obviously 
bimodal. For these reasons, this simple method for mass estimates is not 
regarded as particularly accurate. 
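
As a numerical illustration of (35), the sketch below evaluates the mass
inside the Einstein radius using the standard definition of the critical
surface mass density, $\Sigma_{\rm cr} = c^2 D_{\rm s} / (4\pi G D_{\rm d} D_{\rm ds})$. The
arc radius and the angular-diameter distances are hypothetical round numbers,
not those of any particular cluster.

```python
import math

G_SI = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C_SI = 2.998e8       # speed of light, m / s
MPC = 3.0857e22      # one megaparsec in metres
M_SUN = 1.989e30     # solar mass, kg
ARCSEC = math.pi / (180.0 * 3600.0)   # one arcsecond in radians

def sigma_crit(d_d, d_s, d_ds):
    """Critical surface mass density in kg / m^2 (distances in metres)."""
    return C_SI**2 * d_s / (4.0 * math.pi * G_SI * d_d * d_ds)

def einstein_mass(theta_e_arcsec, d_d_mpc, d_s_mpc, d_ds_mpc):
    """Mass (M_sun) inside the Einstein radius, M = pi (D_d theta_E)^2 Sigma_cr."""
    d_d, d_s, d_ds = (x * MPC for x in (d_d_mpc, d_s_mpc, d_ds_mpc))
    r_e = d_d * theta_e_arcsec * ARCSEC   # physical Einstein radius, metres
    return math.pi * r_e**2 * sigma_crit(d_d, d_s, d_ds) / M_SUN

# Hypothetical example: arc at 30 arcsec; distances in Mpc are made up.
mass = einstein_mass(30.0, d_d_mpc=1000.0, d_s_mpc=2000.0, d_ds_mpc=1400.0)
print(f"M(<theta_E) ~ {mass:.2e} M_sun")
```

The estimate scales as $\theta_{\rm E}^2$, so an error in the adopted cluster
center or arc radius enters quadratically.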

Detailed modeling. The mass determination in cluster centers becomes
much more accurate if several arcs and/or multiple images are present, since
in this case, detailed modeling can be done. This typically proceeds in an
interactive way: First, multiple images have to be identified (based on their
colors and/or detailed morphology, as available with HST imaging). Simple
(plausible) mass models are then assumed, with parameters fixed by matching
the multiple images, and requiring the distortion at the arc location(s) to be
strong and to have the correct orientation. This model then predicts the
presence of possible further multiple images; they can be checked for through
morphology, surface brightness (in particular if HST images of the cluster
are available) and color. If confirmed, a new, refined model is constructed
including these additional strong lensing constraints, which yields further
strong lensing predictions, etc. As is the case for galaxy lensing (see SL), the
components of the mass models are not arbitrary, but chosen to be physically
motivated. Typically, as major component an ellipsoidal isothermal or NFW
distribution is used to describe the overall mass distribution of the cluster.
Refinements of the mass distribution are introduced as mass components
centered on bright cluster member galaxies or on subgroups of such galaxies,
describing massive subhalos which survived a previous merger. Such models
have predictive power and can be trusted in quite some detail; the accuracy
of mass estimates in some favorable cases can be as high as a few percent.

Fig. 14. The galaxy cluster Abell 1689 is the most impressive lensing cluster yet
found. This image has been taken with the new Advanced Camera for Surveys
(ACS) onboard HST. Numerous arcs are seen. A simple estimate for the mass of
the center of the cluster, obtained by identifying the arc radius with the Einstein
radius, yields an extremely large equivalent velocity dispersion. The distribution
of the arcs shown here indicates that such a simple assumption is misleading, and
that more detailed modeling is required




Fig. 15. The lower panel shows the critical curves of the cluster A2390
(cluster redshift $z_{\rm d} = 0.231$), for three different source redshifts of
$z_{\rm s} = 1$, 2.5 and 4 (from inner to outer). The lens model is based on [...]
to the detailed HST image shown here. Identified are two sets of multiple
images, shown in the upper two panels, which obviously need to be at very
high redshift. Indeed, spectroscopy shows that they have $z_{\rm s} = 4.04$ and
$z_{\rm s} = 4.05$ (from Pelló et al. 1999)



In fact, these models can be used to predict the redshift of arcs and arclets.
As an example, we mention the strong lensing analysis of the cluster Abell
2390 based on HST imaging (Pelló et al. 1999). Two pairs of multiple images
were identified (see Fig. 15), which implies that the critical curve has to
pass between the individual components. The location of the critical curves
depends, however, on the source redshift. As shown in the figure, the sources
have to be at a high redshift in order for the corresponding critical curves
to have the correct location. In fact, spectroscopy placed the two sources at
$z_{\rm s} = 4.04$ and $z_{\rm s} = 4.05$, as predicted by the lens model.

Since the distortion of a lens also depends on the source redshift, once a
detailed mass model is available from arcs with known redshifts for at least
some of them, one can estimate the value of the lens strength
$\propto D_{\rm ds}/D_{\rm s}$ and thus infer the redshift of arclets. This method has
been successfully applied to HST observations of clusters (Ebbels et al. 1998).
Of course, having spectroscopic redshifts of the arcs available improves the
calibration of the mass models; they are therefore very useful.
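
The redshift dependence of the lens strength can be made explicit with a
short calculation. The sketch below uses angular-diameter distances in an
Einstein-de Sitter universe purely because they have a simple closed form;
this choice of cosmology is an assumption made here, and the numbers change
somewhat for other models.

```python
def eds_distance(z1, z2):
    """Angular-diameter distance from z1 to z2 in units of c / H_0 for an
    Einstein-de Sitter universe (assumed here only for simplicity)."""
    return 2.0 / (1.0 + z2) * ((1.0 + z1) ** -0.5 - (1.0 + z2) ** -0.5)

def lens_strength(z_d, z_s):
    """Ratio D_ds / D_s; zero for sources in front of the lens."""
    if z_s <= z_d:
        return 0.0
    return eds_distance(z_d, z_s) / eds_distance(0.0, z_s)

z_d = 0.231   # redshift of A2390, as quoted for Fig. 15
for z_s in (1.0, 2.5, 4.0):
    print(f"z_s = {z_s}: D_ds/D_s = {lens_strength(z_d, z_s):.3f}")
```

The ratio rises monotonically with source redshift and flattens at high
redshift, so the method discriminates best between sources that are not too
far behind the lens.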

Lens properties from Fourier transforms. Before discussing results from
these detailed models, a brief technical section shall be placed here, related to
calculating lens properties of general mass distributions. A general method
to obtain the lensing quantities of a mass distribution is through Fourier
transformation. We assume that we have a mass distribution of finite mass;
this is not a serious restriction even for models with formally infinite total
mass, because we can truncate them on large scales, thus making the total
mass finite, without affecting any lensing properties at smaller scales. We
define the Fourier transform $\hat\kappa(\boldsymbol{\ell})$ of the surface mass density as $^4$

$$\hat\kappa(\boldsymbol{\ell}) = \int_{\mathbb{R}^2} d^2\theta \, \kappa(\boldsymbol{\theta}) \exp\left( {\rm i} \boldsymbol{\ell} \cdot \boldsymbol{\theta} \right) , \tag{36}$$

and its inverse by

$$\kappa(\boldsymbol{\theta}) = \frac{1}{(2\pi)^2} \int_{\mathbb{R}^2} d^2\ell \, \hat\kappa(\boldsymbol{\ell}) \exp\left( -{\rm i} \boldsymbol{\ell} \cdot \boldsymbol{\theta} \right) . \tag{37}$$

Similarly, we define the Fourier transforms of the deflection potential,
$\hat\psi(\boldsymbol{\ell})$, of the deflection angle, $\hat{\boldsymbol{\alpha}}(\boldsymbol{\ell})$, and of the
complex shear, $\hat\gamma(\boldsymbol{\ell})$. Differentiation by $\theta_i$ in real space is
replaced by multiplication by $-{\rm i}\ell_i$ in Fourier space. Therefore, the
Fourier transform of $\partial\psi/\partial\theta_j$ is $-{\rm i}\ell_j \hat\psi(\boldsymbol{\ell})$. Hence,
the Poisson equation as given in Sect. 2.2 of IN becomes in Fourier space

$$-|\boldsymbol{\ell}|^2 \hat\psi(\boldsymbol{\ell}) = 2 \hat\kappa(\boldsymbol{\ell}) . \tag{38}$$

Thus, for $\boldsymbol{\ell} \ne 0$, the Fourier transform of the potential which
satisfies the Poisson equation can be readily determined. The $\boldsymbol{\ell} = 0$
mode remains undetermined; however, since this mode corresponds to a constant
in $\psi$, it is unimportant and can be set to zero. Once $\hat\psi$ is determined,
the Fourier transform of the deflection angle and the shear follows from their
definitions in terms of the deflection potential, given in Sect. 2.2 of IN,

$$\hat{\boldsymbol{\alpha}}(\boldsymbol{\ell}) = -{\rm i} \boldsymbol{\ell} \hat\psi(\boldsymbol{\ell}) , \tag{39}$$

$$\hat\gamma(\boldsymbol{\ell}) = -\left( \frac{\ell_1^2 - \ell_2^2}{2} + {\rm i} \ell_1 \ell_2 \right) \hat\psi(\boldsymbol{\ell}) . \tag{40}$$

Thus, in principle, one determines the relevant quantities by Fourier
transforming $\kappa$, then calculating the Fourier transforms of the potential,
deflection, and shear, whose real-space counterparts are then obtained from an
inverse Fourier transform, like in (37).

$^4$ We denote the Fourier variable of three-dimensional space as k, that of
angular position by $\boldsymbol{\ell}$.

Up to now we have not gained anything; the Fourier transforms as defined
above are two-dimensional integrals, as are the real-space relations between
deflection angle and shear, and the surface-mass density. However, provided
$\kappa$ becomes 'small enough' for large values of $|\boldsymbol{\theta}|$, the integral in
(36) may be approximated by one over a finite region in $\theta$-space. This
finite integral is further approximated as a sum over gridpoints, with a
regular grid covering the lens plane. Consider a square in the lens plane of
side L, and let N be the number of gridpoints per dimension, so that
$\Delta\theta = L/N$ is the size of a gridcell. The inverse grid, i.e., the
$\ell$-grid, has a gridcell of size $\Delta\ell = 2\pi/L$. The discrete Fourier
transform then uses the values of $\kappa$ on the $\theta$-grid to calculate
$\hat\kappa$ on the $\ell$-grid. The latter, in fact, is then the Fourier transform
of the periodic continuation of the mass distribution in $\theta$-space. Because
of this periodic continuation, the deflection angle as calculated from the
discrete Fourier transform, which is performed by the Fast Fourier Transform
(FFT) method, is that of the input mass distribution plus all of its periodic
continuations. Here, finally, is why we have considered the Fourier method: the
FFT is a very efficient and quick procedure (see, e.g., Press et al. 1992), and
arguably the best one in cases of mass distributions for which no analytical
progress can be made. The lensing properties are calculated on a grid; if
needed, they can be obtained for other points by interpolation.

Because of the periodic continuation, the mass distribution has to
decrease sufficiently quickly for large $|\boldsymbol{\theta}|$, or be truncated at large
radii. In any case, L should be taken sufficiently large to minimize these
periodicity effects.

Another point to mention is that a periodic mass distribution, each
element of which has positive total mass, has an infinite mass, so that the
deflection potential has to diverge; on the other hand, the deflection
potential is enforced to be periodic. This apparent contradiction can be
resolved by noting that the $\boldsymbol{\ell} = 0$ mode of $\hat\kappa$ is not used in the
calculation of $\boldsymbol{\alpha}$ and $\gamma$. Indeed, if $\hat\psi$ and $\psi$ are calculated
from the above equations, then the resulting $\psi$ does not satisfy the Poisson
equation; the $\psi$ resulting from this procedure is the one corresponding to
$\kappa - \bar\kappa$, where $\bar\kappa$ is the average of $\kappa$ on the
$\theta$-grid. A similar remark is true for the deflection angle. Thus, at the
end, one has to add a term $\bar\kappa |\boldsymbol{\theta}|^2/2$ to $\psi$, and a term
$\bar\kappa \boldsymbol{\theta}$ to $\boldsymbol{\alpha}$.

Since the FFT is very fast, one can choose N and L large, and then consider
only the central part of the $\theta$-grid needed for the actual lens modeling.
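
The procedure described in this subsection can be sketched with NumPy as
follows. This is an illustration under stated assumptions, not the code used
for any of the models discussed here: the function name, the grid parameters
and the Gaussian test lens are all choices made for this example. Note that
NumPy's forward transform uses $\exp(-{\rm i}\boldsymbol{\ell}\cdot\boldsymbol{\theta})$, the opposite
sign to (36), so that a derivative with respect to $\theta_j$ corresponds to
multiplication by $+{\rm i}\ell_j$ in the code; the real-space results do not depend
on this convention.

```python
import numpy as np

def lens_from_kappa_fft(kappa, side):
    """Deflection angle and shear of a convergence map given on a square
    N x N grid of side length `side`, computed via FFT."""
    n = kappa.shape[0]
    dtheta = side / n
    ell = 2.0 * np.pi * np.fft.fftfreq(n, d=dtheta)   # grid spacing 2 pi / L
    l1, l2 = np.meshgrid(ell, ell, indexing="ij")
    l_sq = l1**2 + l2**2
    l_sq[0, 0] = 1.0                     # placeholder, avoids division by zero

    kappa_hat = np.fft.fft2(kappa)
    psi_hat = -2.0 * kappa_hat / l_sq    # Poisson equation in Fourier space
    psi_hat[0, 0] = 0.0                  # the l = 0 mode is set to zero

    alpha1 = np.fft.ifft2(1j * l1 * psi_hat).real
    alpha2 = np.fft.ifft2(1j * l2 * psi_hat).real
    gamma1 = np.fft.ifft2(-0.5 * (l1**2 - l2**2) * psi_hat).real
    gamma2 = np.fft.ifft2(-l1 * l2 * psi_hat).real

    # Add back the contribution of the mean convergence to the deflection;
    # a uniform sheet produces no shear, so gamma needs no correction.
    t = (np.arange(n) - n // 2) * dtheta
    t1, t2 = np.meshgrid(t, t, indexing="ij")
    kappa_mean = kappa.mean()
    return alpha1 + kappa_mean * t1, alpha2 + kappa_mean * t2, gamma1, gamma2

# Test lens: circular Gaussian convergence centred on the grid.
n, side, kappa0, s = 256, 40.0, 1.0, 1.0
t = (np.arange(n) - n // 2) * side / n
t1, t2 = np.meshgrid(t, t, indexing="ij")
kappa = kappa0 * np.exp(-(t1**2 + t2**2) / (2.0 * s**2))
alpha1, alpha2, gamma1, gamma2 = lens_from_kappa_fft(kappa, side)

# Compare with the analytic deflection of the isolated lens on the theta_1 axis.
i, j = n // 2 + 20, n // 2
r = t[i]
alpha_exact = 2.0 * kappa0 * s**2 * (1.0 - np.exp(-r**2 / (2.0 * s**2))) / r
print(alpha1[i, j], alpha_exact)
```

For a lens that is well contained within the grid, the remaining difference
from the isolated-lens result is due to the periodic continuations and
decreases rapidly as L is increased relative to the region of interest.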

4.5 Results from strong lensing in clusters 

The main results of the strong lensing investigations of clusters can be sum- 
marized as follows: 

• The mass in cluster centers is much more concentrated than predicted 
by (simple) models based on X-ray observations. The latter usually pre- 
dict a relatively large core of the mass distribution. These large cores 
would render clusters sub-critical to lensing, i.e., they would be unable 
to produce giant arcs or multiple images. In fact, when arcs were first 
discovered they came as a big surprise because of these expectations. By 
now we know that the intracluster medium is much more complicated 
than assumed in these '$\beta$-model' fits for the X-ray emission. 

• The mass distribution in the inner region of clusters often shows strong 
substructure, or multiple mass peaks. These are also seen in the galaxy
distribution of clusters, but with the arcs they can be verified to also
correspond to mass peaks (examples of this include the cluster Abell 2218
where arcs also curve around a secondary concentration of bright galax- 
ies, clearly indicating the presence of a mass concentration, or the obvi- 
ously bimodal cluster A 370). These are easily understood in the frame 
of hierarchical mergers in a CDM model; the merged clusters retain their 
multiple peaks for a dynamical time or even longer, and are therefore not 
in virial equilibrium. 

• The orientation of the (dark) matter appears to follow closely the ori- 
entation of the light in the cD galaxy; this supports the idea that the 
growth of the cD galaxy is related to the cluster as a whole, through 
repeated accretion of lower-mass member galaxies. In that case, the cD 
galaxy 'knows' the orientation of the cluster. 

• There is in general good agreement between lensing and X-ray mass es- 
timates (e.g., Ettori & Lombardi 2003; Donahue et al. 2003) for those 
clusters where a 'cooling flow' indicates that they are in dynamical equi- 
librium, provided the X-ray analysis takes the presence of the cooling 
flow into account (Allen 1998). 

Probably our 'favourite' clusters in which strong lensing effects are investigated
in detail are biased in favor of having strong substructure, as this increases
the lensing cross section for the occurrence of giant arcs (see below).
Hence, it may be that the most detailed results obtained from strong lensing 
in clusters apply to a class of clusters which are especially selected because 
of their ability to produce spectacular arcs, and thus of their asymmetric 
mass distribution. Therefore, one must be careful in generalizing conclusions 
drawn from the 'arc clusters' to the cluster population as a whole.

Discrepancies. There are a few clusters where the lensing results and those
obtained from analyzing the X-ray observations or cluster dynamics are in
strong apparent conflict. Two of the most prominent ones shall be mentioned
here. The cluster A1689 (see Fig. 14) has arcs more than $\sim 40''$ away from
the cluster center, which would imply a huge mass in this cluster center. 
This high mass is apparently confirmed by the high velocity dispersion of its 
member galaxies, although their distribution in redshift makes it likely that 
the cluster consists of several subcomponents (see Clowe & Schneider 2001 
for a summary of these results). Several weak lensing results of this cluster 
have been published, and they are not all in agreement: whereas Tyson & 
Fischer (1995) from weak shear, and Taylor et al. (1998) and Dye et al. 
(2001) from the magnification method (that will be discussed in the next 
Section) find also a very high mass for this cluster, the weak lensing analysis 
of Clowe & Schneider (2001; see also King et al. 2002b), based on deep wide- 
field imaging data of this cluster, finds a more moderate mass (or equivalent 
velocity dispersion) for this cluster. A new XMM-Newton X-ray observation 
of this cluster (Andersson & Madejski 2004) lends support for the smaller 
mass; in fact, their estimate of the virial mass of the cluster agrees with that 
obtained by Clowe & Schneider (2001). However, the discrepancy with the 
strong lensing mass in the cluster center remains at present; a quantitative 
analysis of the ACS data shown in Fig. 14 will hopefully shed light on this 
issue. 

A second clear example of discrepant results is the cluster Cl 0024+17. 
It has a prominent arc system, indicating an Einstein radius of ~ 30", and 
thus a high mass. The X-ray properties of this cluster, however, indicate 
a much smaller mass (Soucail et al. 2000), roughly by a factor of three. 
This discrepancy has been reaffirmed by recent Chandra observations, which 
confirmed this factor-of-three problem (Ota et al. 2004). The resolution of 
this discrepancy has probably been found by Czoske et al. (2001, 2002), who 
performed an extensive spectroscopic survey of cluster galaxies. Their result is 
best interpreted such that Cl 0024+17 presents a merger of two clusters along
our line-of-sight, which implies that the measured velocity dispersion cannot
be easily turned into a mass, as this system is not in virial equilibrium,
and that the X-ray data cannot be converted to a mass either, due to the 
likely strong deviation from spherical symmetry and equilibrium. A wide 
field sparsely sampled HST observation of this cluster (Kneib et al. 2003) 
also indicates the presence of a second mass concentration about 3' away from 
the main peak. As will be mentioned below, clusters undergoing mergers have 
particularly high cross sections for producing arcs (Torri et al. 2004); hence, 
our 'favourites' are most likely selected for these non-equilibrium clusters. 

Arc statistics. The abundance of arcs is expected to be a strong function 
of the cosmological parameters: they not only determine the abundance of 
massive clusters (through the mass function discussed in Sect. 4.5 of IN), but
also the degree of relaxation of clusters, which in turn affects their strong
lensing cross section (Bartelmann et al. 1998). It is therefore interesting to 
consider the expected abundance of arcs as a function of cosmological pa- 
rameters and compare this to the observed abundance. In a series of papers, 
M. Bartelmann and his colleagues have studied the expected giant arc abun- 
dance, using analytical as well as numerical techniques (e.g., Bartelmann & 
Weiss 1994; Bartelmann et al. 1995, 1998, 2002; Meneghetti et al. 2004; see 
also Dalal et al. 2003; Oguri et al. 2003; Wambsganss et al. 2004). Some of 
the findings of these studies can be summarized as follows: 

• The formation of arcs depends very sensitively on the deviation from 
spherical symmetry and the detailed substructure of the mass distri- 
bution in the cluster. For this reason, analytical models which cannot 
describe this substructure with sufficient realism (see Bergmann & Petrosian
1993) do not provide reliable predictions for the arc statistics
(in particular, axisymmetric mass models are essentially useless for estimating
arc statistics), and one needs to refer to numerical simulations
of structure formation. Since the substructure and triaxiality play such
an important role, these simulations have to be of high spatial and mass 
resolution. 

• The frequency of arcs depends of course on the abundance of clusters, 
which in turn depends on the cosmological model and the fluctuation 
spectrum of the matter, in particular its normalization $\sigma_8$. Furthermore,
clusters at a given redshift have different mean ages in different cosmological
models, as the history of structure growth, and thus the merging
history, depends on $\Omega_{\rm m}$ and $\Omega_\Lambda$. Since the age of a cluster is one of the
determining parameters for its level of substructure - younger clusters have
not had enough time to fully relax - this affects the lensing cross
section of the clusters for arc formation. In fact, during epochs of merg- 
ers, the arc cross-section can have temporary excursions by large factors. 
Even the same cluster at the same epoch can have arc forming cross sec- 
tions that vary by more than an order-of-magnitude between different 
projection directions of the cluster. For fixed cluster abundance today, 
low-density models form clusters earlier than high-density models. 

• Since the largest contribution of the total cross section for arc formation 
comes from clusters at intermediate redshift (z ~ 0.4), also the equation- 
of-state of the dark energy matters; as shown in Meneghetti et al. (2004), 
what matters is the dark energy density at the epoch of cluster forma- 
tion. In addition, the earlier clusters form, the higher their characteristic 
density, which then makes them more efficient lenses for arc formation. 

Taking these effects together, a low-density open model produces a larger 
number of arcs than a flat low-density model, which in turn has more arcs 
than a high-density model, for a given cluster abundance today. Whereas 
the differences between these models obtained by Meneghetti et al. (2004) 
are smaller than claimed in Bartelmann et al. (1998), they in principle allow
constraining the cosmological parameters, provided they can be compared
with the observed number of arcs. 

Unfortunately, there are only a few systematic studies of clusters with
regard to their strong lensing contents. Luppino et al. (1999) report on 8
giant arcs in their sample of the 38 most massive clusters found in the Einstein
Medium Sensitivity Survey. Zaritsky & Gonzalez (2003) surveyed clusters in
the redshift range $0.5 \lesssim z \lesssim 0.7$ over 69 deg$^2$ and found two giant arcs with
$R < 21.5$ and a length $> 10''$. Gladders et al. (2003) found 5 arc candidates
in their Red Cluster Sequence survey of 90 deg$^2$, all of them being associated
with high-redshift clusters. In contrast to the claim by Bartelmann et al.
(1998), these observed arc frequencies can be accounted for in a standard
$\Lambda$CDM Universe, as shown by Dalal et al. (2003). There are several differences
between these two studies, which are based on different assumptions about
the number density of clusters and the source redshift distribution, which
Dalal et al. (2003) took from the Hubble Deep Field, whereas Bartelmann et
al. (1998) assumed all sources to have $z_{\rm s} = 1$.

The strong dependence on the source redshift distribution has been pointed 
out by Wambsganss et al. (2004). In contrast to the other studies, they in- 
vestigated the arc statistics using ray tracing through a three-dimensional 
mass distribution obtained from cosmological simulations, whereas the other 
studies mentioned considered the lensing effect of individual clusters found 
in these simulations. Although the former approach is more realistic, the as- 
sumption of Wambsganss et al. (2004) that the magnification of a light ray 
is a good measure for the length-to-width ratio of a corresponding arc is
certainly not justified in detail, as shown in Dalal et al. (2003). The agreement
of the lensing probability between Wambsganss et al. (2004) and Bartelmann
et al. (1998) for all sources at $z_{\rm s} = 1$ is therefore most likely a coincidence.

There are further difficulties in obtaining realistic predictions for the oc- 
currence of giant arcs that can be compared with observations. First, the 
question of whether an image counts as an arc depends on a combination of 
source size, lens magnification, and seeing. Seeing makes arcs rounder and 
therefore reduces their length-to-width ratio. An impressive demonstration 
of this effect is provided by the magnificent system of arcs in the cluster 
A1689 observed with the ACS onboard the HST, as shown in Fig. 14, com- 
pared to earlier ground-based images of this cluster. Second, several of the 
above-mentioned papers assume the source size to be $\theta = 1''$, whereas many
arcs observed with HST are essentially unresolved in width, implying much 
smaller source sizes (and accordingly, a much higher sensitivity to seeing ef- 
fects). Third, magnification bias is usually not taken into account in these 
theoretical studies. In fact, accounting properly for the magnification bias is 
quite difficult, as the surveys reporting on arc statistics are not really flux- 
limited. One might argue that they are surface brightness-limited, but even 
if this were true, the surface brightness of an arc coming from a small source 
depends very much on the seeing.

Therefore, at present, the abundance of arcs seems not to be in conflict
with a $\Lambda$CDM model, but more realistic simulations which take the
aforementioned effects into account are certainly needed for a definite conclusion
on this issue. On the observational side, increasing the number of clusters 
for which high-quality imaging is performed is of great importance, and the 
survey of luminous X-ray clusters imaged either with the ACS@HST or with 
ground-based telescopes during periods of excellent seeing would improve 
the observational situation dramatically. Blank-field surveys, such as they 
are conducted for cosmic shear research (see Sect. 7), could be used for blind 
searches of arcs (that is, not restricted to regions around known clusters). It 
may turn out, however, that the number of 'false positives' is unacceptably 
high, e.g., by misidentification of edge-on spirals, or blends of sources that 
yield apparent images with a high length-to-width ratio. 

Constraints on collisional dark matter. Spergel & Steinhardt (2000)
suggested the possibility that dark matter particles are not only weakly
interacting, but may have a larger elastic scattering cross-section. If this cross-
section of such self-interacting dark matter is sufficiently large, it may help to 
explain two of the remaining apparent discrepancies between the predictions 
of the Cold Dark Matter model and observations: The slowly rising rotation 
curves of dwarf galaxies (e.g., de Blok et al. 2001) and the substructure of 
galaxy-scale dark matter halos (see Sect. 8 of SL). Self-interaction may soften
the strength of the central density concentration as compared to the NFW
profile, and could destroy most of the subclumps. However, there are other
consequences of such an interaction, in that the shapes of the inner parts of
dark matter halos tend to be more spherical. Meneghetti et al. (2001) have in- 
vestigated the influence of self interaction of dark matter particles on clusters 
of galaxies, in particular their ability to form giant arcs. From their numerical 
simulations of clusters with varying cross-sections of particles, they showed 
that even a relatively small cross-section is sufficient to reduce the ability 
of clusters to produce giant arcs by an order of magnitude. This is mainly 
due to two effects, the reduced asymmetry of the resulting mass distribution 
and the shallower central density profile. Furthermore, self-interactions de- 
stroy the ability of clusters to form radial arcs. Therefore, the 'desired' effect 
of self-interaction - to smooth the mass distribution of galaxies - has the 
same consequence for clusters, and can therefore probably be ruled out as 
a possible mechanism to cure the aforementioned apparent problems of the 
CDM model. From combining X-ray and lensing data of the cluster 0657-56,
Markevitch et al. (2004) obtained upper limits on the self-interaction cross 
section of dark matter. 

Do clusters follow the universal NFW profile? The CDM paradigm
of structure formation predicts a universal density profile of dark matter halos.
One might therefore investigate whether the strong lensing properties of
clusters are compatible with this mass profile. Of particular value for such
an investigation are clusters which contain several strong lensing features, 
and in particular a radial arc, as it probes the inner critical curve of the 
cluster. Sand et al. (2004; see also Sand et al. 2002) claim from a sample 
of three clusters with radial arcs, that the slope of the inner mass profile 
must be considerably flatter than predicted by the NFW model. However, 
this conclusion is derived under the assumption of an axially-symmetric lens 
model. As is true for strong lensing by galaxies (see SL), axisymmetric mass
models are not generic, and therefore conclusions derived from them are prone
to the systematics of the symmetry assumption. That was demonstrated by
Bartelmann & Meneghetti (2004) who showed that, as expected, the conclusion
about the inner slope changes radically once a finite ellipticity of the
mass distribution is allowed for, removing the apparent discrepancy with the 
predictions from CDM models. 

Cosmological parameters from strong lensing systems. The lens strength, 
at given physical surface mass density $\Sigma$, depends on the redshifts of lens and
source, as well as on the geometry of the Universe which enters the distance- 
redshift relation. Therefore, it has been suggested that a cluster which con- 
tains a large number of strong lensing features can be used to constrain cos- 
mological parameters, provided the sources of the arcs and multiple image 
systems cover a large range of redshifts (Link & Pierce 1998). Simulations of 
this effect, using realistic cluster models, confirmed that such purely geometrical
constraints can in principle be derived (Golse et al. 2002). One of the best
studied strong-lensing clusters up to now is A2218, for which four multiple-
image systems with measured (spectroscopic) redshift have been identified,
which allows very tight constraints on the mass distribution in this cluster.
Soucail et al. (2004) applied the aforementioned method to this cluster and
obtained first constraints on the density parameter $\Omega_{\rm m}$, assuming a flat
cosmological model. This work can be viewed as a proof of concept; the new ACS
camera onboard HST will allow the identification of even richer strong lensing 
systems in clusters, of which the one in A1689 (see Fig. 14) is a particularly 
impressive example. 
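The geometrical dependence exploited by this method can be made concrete with a small sketch. It is an illustration only, under assumptions not taken from the text: a flat universe with $\Omega_\Lambda = 1 - \Omega_{\rm m}$, for which the distance ratio $D_{\rm ds}/D_{\rm s}$ reduces to $1 - \chi(z_{\rm d})/\chi(z_{\rm s})$ with $\chi$ the comoving distance, and purely illustrative redshifts. The function names are hypothetical.

```python
import math

def comoving_distance(z, omega_m, n_steps=2000):
    """Comoving distance in units of c/H0 for a flat universe with
    Omega_Lambda = 1 - omega_m, by Simpson integration of 1/E(z)."""
    def inv_E(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + 1.0 - omega_m)
    h = z / n_steps                      # n_steps must be even for Simpson's rule
    total = inv_E(0.0) + inv_E(z)
    for k in range(1, n_steps):
        total += (4.0 if k % 2 else 2.0) * inv_E(k * h)
    return total * h / 3.0

def lens_efficiency(z_d, z_s, omega_m):
    """D_ds / D_s in a flat universe; zero for sources in front of the lens."""
    if z_s <= z_d:
        return 0.0
    return 1.0 - comoving_distance(z_d, omega_m) / comoving_distance(z_s, omega_m)

def efficiency_ratio(z_d, z_s1, z_s2, omega_m):
    """Ratio of lens strengths for two source redshifts behind the same lens.
    The unknown mass normalisation of the lens cancels in this ratio."""
    return lens_efficiency(z_d, z_s1, omega_m) / lens_efficiency(z_d, z_s2, omega_m)

# Illustrative numbers only: a lens at z_d = 0.2 and sources at z = 0.7 and 3
for omega_m in (0.3, 1.0):
    print(omega_m, round(efficiency_ratio(0.2, 0.7, 3.0, omega_m), 4))
```

The two printed ratios differ only at the per cent level, which indicates why a tightly constrained mass model and sources spread over a wide redshift range are needed.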

5 Mass reconstructions from weak lensing 

Whereas strong lensing probes the mass distribution in the inner part of 
clusters, weak lensing can be used to study the mass distribution at much 
larger angular separations from the cluster center. In fact, as we shall see, 
weak lensing can provide a parameter-free reconstruction of the projected 
two-dimensional mass distribution in clusters - and hence offers the prospect 
of mapping the dark matter distribution of clusters directly. This discovery 
(Kaiser & Squires 1993) can be viewed as marking the beginning of quantitative
weak lensing research. But even before this discovery, weak lensing by
clusters had been observed in a number of cases. Fort et al. (1988) found
that in addition to the giant arc in A 370, there are a number of images 
stretched in the direction tangent to the center of the cluster, but with much 
less spectacular axis ratios than the giant arc in this cluster; they termed 
these new features 'arclets'. Tyson et al. (1990) found a statistically signifi- 
cant tangential alignment of faint galaxy images relative to the center of the 
clusters A 1689 and Cl 1409+52, and obtained a mass profile from these lens 
distortion maps. Comparison with numerical simulations yielded an estimate 
of the cluster velocity dispersion, assuming an isothermal sphere profile. 

In this section we consider the parameter-free mass reconstruction tech- 
nique, first the original Kaiser & Squires method, and then a number of 
improvements of this method. We then turn to the magnification effects; the 
change of the number density of background sources, as predicted from (26), 
can be turned into a local estimate of the surface mass density, and this 
method has been employed in a number of clusters. Next we shall consider 
inverse methods for the reconstruction of the mass distribution, which on the 
one hand are more difficult to apply than the 'direct' methods, but on the 
other hand are expected to yield more satisfactory results. Whereas the two- 
dimensional maps yield a good visual impression on the mass distribution in 
clusters, it is hard to extract quantitative information from them. In order 
to get quantities that describe the mass and that can be compared between 
clusters, often parameterized mass models are more useful, which are con- 
sidered next. Finally, we consider aperture mass measures, which have been 
introduced originally to obtain a mass quantity that is unaffected by the 
mass-sheet degeneracy, but as will be shown, has a number of other useful 
features. In particular, employing the aperture mass, one can devise a method 
to systematically search for mass concentrations on cluster-mass scales, using 
their shear properties only, i.e. without referring to their luminous properties. 

5.1 The Kaiser Squires inversion 

Weak lensing yields an estimate of the local (reduced) shear, as discussed in 
Sect. 2.2. Here we shall discuss how to derive the surface mass density from 
a measurement of the (reduced) shear. Recalling eq. (IN-26), the relation 
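between shear and surface mass density is

$$\gamma(\theta) = \frac{1}{\pi}\int_{\mathbb{R}^2} d^2\theta'\,\mathcal{D}(\theta-\theta')\,\kappa(\theta') , \qquad \mathcal{D}(\theta) \equiv \frac{\theta_2^2-\theta_1^2-2{\rm i}\theta_1\theta_2}{|\theta|^4} = \frac{-1}{(\theta_1-{\rm i}\theta_2)^2} . \tag{41}$$

Hence, the complex shear $\gamma$ is a convolution of $\kappa$ with the kernel $\mathcal{D}$ or, in
other words, $\mathcal{D}$ describes the shear generated by a point mass. This relation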
can be inverted: in Fourier space this convolution becomes a multiplication,
which can be inverted to yield

$$\hat\kappa(\ell) = \pi^{-1}\,\hat\gamma(\ell)\,\hat{\mathcal{D}}^*(\ell) \quad {\rm for} \quad \ell \neq 0 , \tag{42}$$

where the Fourier transform of $\mathcal{D}$ is (see footnote 5)

$$\hat{\mathcal{D}}(\ell) = \pi\,\frac{\ell_1^2-\ell_2^2+2{\rm i}\ell_1\ell_2}{|\ell|^2} ; \tag{43}$$

note that this implies that $\hat{\mathcal{D}}(\ell)\hat{\mathcal{D}}^*(\ell) = \pi^2$, which has been used in obtaining
(42). It is obvious that $\hat{\mathcal{D}}$ is undefined for $\ell = 0$, which has been indicated in
the foregoing equations. Fourier back-transformation of (42) then yields

$$\begin{aligned} \kappa(\theta)-\kappa_0 &= \frac{1}{\pi}\int_{\mathbb{R}^2} d^2\theta'\,\mathcal{D}^*(\theta-\theta')\,\gamma(\theta') \\ &= \frac{1}{\pi}\int_{\mathbb{R}^2} d^2\theta'\,\mathcal{R}{\rm e}\left[\mathcal{D}^*(\theta-\theta')\,\gamma(\theta')\right] . \end{aligned} \tag{44}$$

Note that the constant $\kappa_0$ occurs since the $\ell = 0$ mode is undetermined.
Physically, this is related to the fact that a uniform surface mass density
yields no shear. Furthermore, it is obvious (physically, though not so easily
seen mathematically) that $\kappa$ must be real; for this reason, the imaginary part
of the integral should be zero, and taking the real-part only [as in the second
line of (44)] makes no difference. However, in practice this is different, since
noisy data, when inserted into the inversion formula, will produce a non-zero
imaginary part. What (44) shows is that if $\gamma$ can be measured, $\kappa$ can be
determined.
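On a periodic grid the inversion amounts to one multiplication in Fourier space. The following minimal sketch assumes NumPy, a square grid of side length L, and the Fourier-space kernel $\hat{\mathcal{D}}(\ell) = \pi(\ell_1^2-\ell_2^2+2{\rm i}\ell_1\ell_2)/|\ell|^2$; the function names are hypothetical. It is meant to illustrate the relation between shear and convergence, not to be a usable estimator for real data, which are noisy, irregularly sampled and confined to a finite field.

```python
import numpy as np

def kernel_hat(N, L):
    """Fourier transform of the kernel D on an N x N grid of side length L."""
    ell = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
    l1, l2 = np.meshgrid(ell, ell, indexing="ij")
    l_sq = l1**2 + l2**2
    l_sq[0, 0] = 1.0                     # avoid 0/0; the mode is zeroed below
    D_hat = np.pi * (l1**2 - l2**2 + 2j * l1 * l2) / l_sq
    D_hat[0, 0] = 0.0                    # D_hat is undefined at l = 0
    return D_hat

def shear_from_kappa(kappa, L):
    """Forward relation: gamma_hat = D_hat * kappa_hat / pi (complex shear)."""
    D_hat = kernel_hat(kappa.shape[0], L)
    return np.fft.ifft2(D_hat * np.fft.fft2(kappa) / np.pi)

def kaiser_squires(gamma, L):
    """Inverse relation: kappa_hat = conj(D_hat) * gamma_hat / pi.

    Returns the real part, i.e. kappa up to the undetermined constant
    (here: kappa minus its field average), and the imaginary part, which
    vanishes for noise-free shear derived from a real mass distribution.
    """
    D_hat = kernel_hat(gamma.shape[0], L)
    kappa = np.fft.ifft2(np.conj(D_hat) * np.fft.fft2(gamma) / np.pi)
    return kappa.real, kappa.imag
```

Feeding the output of `shear_from_kappa` into `kaiser_squires` returns the input convergence minus its mean, which makes the undetermined constant explicit.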

Before looking at this in more detail, we briefly mention some difficulties 
with the inversion formula as given above: 

• Since $\gamma$ can at best be estimated at discrete points (galaxy images),
smoothing is required. One might be tempted to replace the integral
in (44) by a discrete sum over galaxy positions, but as shown by Kaiser
& Squires (1993), the resulting mass density estimator has infinite noise
(due to the $\theta^{-2}$-behavior of the kernel $\mathcal{D}$).

• It is not the shear $\gamma$, but the reduced shear $g$ that can be determined
from the galaxy ellipticities; hence, one needs to obtain a mass density
estimator in terms of $g$. In the case of 'weak' weak lensing, i.e., where
$\kappa \ll 1$ and $|\gamma| \ll 1$, one has $\gamma \approx g$.

• The integral in (44) extends over $\mathbb{R}^2$, whereas data are available only on a
finite field; therefore, it needs to be seen whether modifications allow the
construction of an estimator for the surface mass density from finite-field
shear data.



5 The form of $\hat{\mathcal{D}}$ can be obtained most easily by using the relations between the
surface mass density and the shear components in terms of the deflection potential
$\psi$, given in (IN-18). Fourier transforming those immediately yields $\hat\kappa = -|\ell|^2\hat\psi/2$,
$\hat\gamma_1 = -(\ell_1^2-\ell_2^2)\hat\psi/2$, $\hat\gamma_2 = -\ell_1\ell_2\hat\psi$. Eliminating $\hat\psi$ from the foregoing relations,
the expression for $\hat{\mathcal{D}}$ is obtained.

• To get absolute values for the surface mass density, the additive constant
$\kappa_0$ is of course a nuisance. As will be explained soon, this indeed is the
largest problem in mass reconstructions, and is the mass-sheet degeneracy
discussed in Sect. 2.5 of IN.



5.2 Improvements and generalizations 

Smoothing. Smoothing of data is needed to get a shear field from discrete 
data points. Consider first the case that we transform (44) into a sum over 
galaxy images (ignoring the constant $\kappa_0$ for a moment, and also assuming
the weak lensing case, $\kappa \ll 1$, so that the expectation value of $\epsilon$ is the shear
$\gamma$),

$$\kappa_{\rm disc}(\theta) = \frac{1}{n\pi}\sum_i \mathcal{R}{\rm e}\left[\mathcal{D}^*(\theta-\theta_i)\,\epsilon_i\right] , \tag{45}$$

where the sum extends over all galaxy images at positions $\theta_i$ and complex
ellipticity $\epsilon_i$, and $n$ is the number density of background galaxies. As shown
by Kaiser & Squires (1993), the variance of this estimator for $\kappa$ diverges.
However, one can smooth this estimator, using a weight function $W(\Delta\theta)$
(assumed to be normalized to unity), to obtain

$$\kappa_{\rm smooth}(\theta) = \int d^2\theta'\,W(|\theta-\theta'|)\,\kappa_{\rm disc}(\theta') , \tag{46}$$

which now has a finite variance. One might expect that, since (i) smoothing 
can be represented by a convolution, (ii) the relation between $\kappa$ and $\gamma$ is a
convolution, and (iii) convolution operations are transitive, it does not matter
whether the shear field is smoothed first and inserted into (44), or one uses
(46) directly. This statement is true if the smoothing of the shear is performed
as

$$\gamma_{{\rm smooth};1}(\theta) = \frac{1}{n}\sum_i W(|\theta-\theta_i|)\,\epsilon_i . \tag{47}$$

If this expression is inserted into (44), one indeed recovers the estimate (46). 
However, this is not a particularly good method for smoothing, as can be 
seen as follows: the background galaxy positions will at least have Poisson 
noise; in fact, since the angular correlation function even of faint galaxies is 
non-zero, local number density fluctuations will be larger than predicted from 
a Poisson distribution. However, in the estimator (45) and in the smoothing 
procedure (47), these local variations of the number density are not taken 
into account. A much better way (Seitz & Schneider 1995) to smooth the 
shear is given by 



$$\gamma_{{\rm smooth};2}(\theta) = \left[\sum_i W(|\theta-\theta_i|)\right]^{-1}\sum_i W(|\theta-\theta_i|)\,\epsilon_i , \tag{48}$$

which takes these local number density fluctuations into account. Lombardi
& Schneider (2001) have shown that the expectation value of the smoothed 
shear estimate (48) is not exactly the shear smoothed by the kernel W, but 
the deviation (i.e., the bias) is very small provided the effective number of 
galaxy images inside the smoothing function W is substantially larger than 
unity, which will always be the case for realistic applications. Lombardi & 
Schneider (2002) then have demonstrated that the variance of (48) is indeed 
substantially reduced compared to that of (47), in agreement with the finding 
of Seitz & Schneider (1995). 
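The difference between the two smoothing schemes can be seen in a toy setup. The sketch below is an illustration under assumptions that are not part of the text: a Gaussian weight $W \propto \exp(-|\Delta\theta|^2/\theta_{\rm s}^2)$ normalized to unity, Poisson-distributed galaxy positions, and sources without intrinsic ellipticity, so that only the effect of number density fluctuations remains. The function names are hypothetical.

```python
import math
import random

def gaussian_weight(d, theta_s):
    """Gaussian weight, normalised to unity in two dimensions (assumed form)."""
    return math.exp(-(d / theta_s) ** 2) / (math.pi * theta_s**2)

def smoothed_shear(theta, positions, ellipticities, theta_s, n_mean=None):
    """Smoothed shear at position theta.

    With n_mean given, the weighted sum is divided by the mean number
    density, as in (47); otherwise it is divided by the sum of the weights,
    as in (48), which follows the local number density of galaxies.
    """
    weights = [gaussian_weight(math.dist(theta, p), theta_s) for p in positions]
    weighted_sum = sum(w * e for w, e in zip(weights, ellipticities))
    norm = n_mean if n_mean is not None else sum(weights)
    return weighted_sum / norm

# Toy data: uniform shear, random positions, no intrinsic ellipticity
rng = random.Random(1)
side, n_mean = 10.0, 30.0
positions = [(rng.uniform(0, side), rng.uniform(0, side))
             for _ in range(int(n_mean * side**2))]
shear = 0.1 + 0.05j
ellipticities = [shear] * len(positions)
est_47 = smoothed_shear((5.0, 5.0), positions, ellipticities, 1.0, n_mean=n_mean)
est_48 = smoothed_shear((5.0, 5.0), positions, ellipticities, 1.0)
print(abs(est_47 - shear), abs(est_48 - shear))
```

In this idealized case the estimate normalized by the sum of weights returns the input shear up to rounding, whereas the one normalized by the mean density inherits the local fluctuation in the number of galaxies under the kernel.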

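When smoothed with a Gaussian kernel of angular scale $\theta_{\rm s}$, the covariance
of the resulting mass map is finite, and given by (Lombardi & Bertin 1998;
van Waerbeke 2000)

$${\rm Cov}\left(\kappa(\theta),\kappa(\theta')\right) = \frac{\sigma_\epsilon^2}{4\pi\theta_{\rm s}^2\,n}\,\exp\left(-\frac{|\theta-\theta'|^2}{2\theta_{\rm s}^2}\right) . \tag{49}$$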

Thus, the larger the smoothing scale, the less noisy is the corresponding 
mass map; on the other hand, the more are features washed out. Choosing 
the appropriate smoothing scale is not easy; we shall come back to this issue 
in Sect. 5.3 below. 
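To get a feeling for the numbers, the covariance of the Gaussian-smoothed mass map can be evaluated directly. The sketch assumes the form ${\rm Cov} = \sigma_\epsilon^2/(4\pi\theta_{\rm s}^2 n)\,\exp[-|\theta-\theta'|^2/(2\theta_{\rm s}^2)]$, with $\sigma_\epsilon$ the ellipticity dispersion and $n$ the number density of sources; the numerical values ($\sigma_\epsilon = 0.3$, $n = 30\,{\rm arcmin}^{-2}$, $\theta_{\rm s} = 1'$) are illustrative assumptions, not values from the text.

```python
import math

def kappa_covariance(separation, sigma_eps, n_gal, theta_s):
    """Noise covariance of a Gaussian-smoothed mass map between two points
    at the given separation (same angular unit as theta_s; n_gal per unit area)."""
    amplitude = sigma_eps**2 / (4.0 * math.pi * theta_s**2 * n_gal)
    return amplitude * math.exp(-separation**2 / (2.0 * theta_s**2))

# Illustrative values: sigma_eps = 0.3, 30 galaxies per arcmin^2
for theta_s in (0.5, 1.0, 2.0):
    rms = math.sqrt(kappa_covariance(0.0, 0.3, 30.0, theta_s))
    print(f"theta_s = {theta_s:3.1f} arcmin: rms noise in kappa = {rms:.4f}")
```

The rms noise scales as $1/\theta_{\rm s}$, while the noise at points separated by more than a few $\theta_{\rm s}$ becomes uncorrelated; this is the trade-off between noise and resolution mentioned above.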

The non-linear case, $g \neq \gamma$. Noting that the reduced shear $g = \gamma/(1-\kappa)$
can be estimated from the ellipticity of images (assuming that we avoid
the potentially critical inner region of the cluster, where $|g| > 1$; indeed,
this can also be taken into account, at the price of somewhat increased
complexity), one can write:

$$\kappa(\theta)-\kappa_0 = \frac{1}{\pi}\int_{\mathbb{R}^2} d^2\theta'\,\left[1-\kappa(\theta')\right]\,\mathcal{R}{\rm e}\left[\mathcal{D}^*(\theta-\theta')\,g(\theta')\right] ; \tag{50}$$

this integral equation for $\kappa$ can be solved by iteration, and it converges quickly
(Seitz & Schneider 1995). Note that in this case, the undetermined constant
$\kappa_0$ no longer corresponds to adding a uniform mass sheet. What the arbitrary
value of $\kappa_0$ corresponds to can be seen as follows: The transformation

$$\kappa(\theta) \to \kappa'(\theta) = \lambda\kappa(\theta) + (1-\lambda) \quad {\rm or} \quad \left[1-\kappa'(\theta)\right] = \lambda\left[1-\kappa(\theta)\right] \tag{51}$$

changes the shear $\gamma \to \gamma' = \lambda\gamma$, and thus leaves $g$ invariant; this is the
mass-sheet degeneracy! It can be broken if magnification information can be
obtained, since $\mathcal{A} \to \mathcal{A}' = \lambda\mathcal{A}$, so that

$$\mu \to \mu' = \lambda^{-2}\mu .$$
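The iterative solution of (50) can be sketched on a periodic grid, where the integral becomes a product in Fourier space. This is a minimal illustration rather than the scheme of Seitz & Schneider (1995): it assumes NumPy, noise-free reduced shear on a square grid of side length L, and it fixes the free constant by prescribing the field-averaged convergence `kappa_mean`, which in practice is exactly the quantity that the mass-sheet degeneracy leaves undetermined. The function names are hypothetical.

```python
import numpy as np

def ks_invert(gamma, L):
    """Mean-free convergence from complex shear on a periodic grid."""
    N = gamma.shape[0]
    ell = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
    l1, l2 = np.meshgrid(ell, ell, indexing="ij")
    l_sq = l1**2 + l2**2
    l_sq[0, 0] = 1.0                     # avoid 0/0; the mode is zeroed below
    D_hat = np.pi * (l1**2 - l2**2 + 2j * l1 * l2) / l_sq
    D_hat[0, 0] = 0.0
    return np.fft.ifft2(np.conj(D_hat) * np.fft.fft2(gamma) / np.pi).real

def kappa_from_reduced_shear(g, L, kappa_mean=0.0, n_iter=30):
    """Iterate kappa <- KS[(1 - kappa) g] + const.

    The constant is chosen such that the field average equals kappa_mean
    (an assumption made here to pick one member of the degenerate family).
    """
    kappa = np.full(g.shape, kappa_mean, dtype=float)
    for _ in range(n_iter):
        kappa = ks_invert((1.0 - kappa) * g, L)
        kappa += kappa_mean - kappa.mean()
    return kappa
```

For sub-critical lenses, where $|g| < 1$ everywhere, each step reduces the error by roughly a factor of the maximum of $|g|$, so a handful of iterations suffices. Calling the function with a different `kappa_mean` yields another solution that reproduces the same reduced shear.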

Magnification information can be obtained from the number counts of images 
(Broadhurst et al. 1995), owing to the magnification bias, provided the un- 
lensed number density is sufficiently well known. In principle, the mass sheet
degeneracy can also be broken if redshift information of the source galaxies
is available and if the sources are widely distributed in redshift; this can be
seen as follows: let

$$Z(z_{\rm s}) \equiv \frac{D_{\rm ds}/D_{\rm s}}{\lim_{z_{\rm s}\to\infty} D_{\rm ds}/D_{\rm s}}\,{\rm H}(z_{\rm s}-z_{\rm d}) \tag{52}$$

(H being the Heaviside step function) be the ratio of the lens strength of a
source at $z_{\rm s}$ to that of a fiducial source at infinite redshift (see Fig. 16); then,
if $\kappa_\infty$ and $\gamma_\infty$ denote the surface mass density and shear for such a fiducial
source, the reduced shear for a source at $z_{\rm s}$ is

$$g = \frac{Z\,\gamma_\infty}{1-Z\,\kappa_\infty} , \tag{53}$$

and there is no global transformation of $\kappa_\infty$ that leaves $g$ invariant for sources
at all redshifts, showing the validity of the above statement. However, even
in this case the mass-sheet degeneracy is only mildly broken (see Bradac
et al. 2004). In particular, only those regions in the cluster where the
non-linearity (i.e., the difference between $\gamma$ and $g$) is noticeable can contribute to
the degeneracy breaking, that is, the region near the critical curves where
$|g| \sim 1$.
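The statements about the degeneracy and its breaking can be checked numerically at a single position. The sketch below uses the transformation $\kappa \to \lambda\kappa + (1-\lambda)$, $\gamma \to \lambda\gamma$, the reduced shear $g = Z\gamma_\infty/(1-Z\kappa_\infty)$, and the standard magnification $\mu = [(1-\kappa)^2-|\gamma|^2]^{-1}$; the numerical values and function names are illustrative assumptions.

```python
def reduced_shear(kappa_inf, gamma_inf, Z=1.0):
    """Reduced shear for a source with redshift weight Z."""
    return Z * gamma_inf / (1.0 - Z * kappa_inf)

def magnification(kappa, gamma):
    """mu = 1 / det A = 1 / [(1 - kappa)^2 - |gamma|^2]."""
    return 1.0 / ((1.0 - kappa) ** 2 - abs(gamma) ** 2)

def mass_sheet(kappa, gamma, lam):
    """kappa -> lam * kappa + (1 - lam), gamma -> lam * gamma."""
    return lam * kappa + (1.0 - lam), lam * gamma

kappa, gamma, lam = 0.4, 0.2 + 0.1j, 0.8       # illustrative local lens parameters
kappa_t, gamma_t = mass_sheet(kappa, gamma, lam)

# Single source redshift (Z = 1): g is unchanged, mu is rescaled by lam**-2
print(reduced_shear(kappa, gamma), reduced_shear(kappa_t, gamma_t))
print(magnification(kappa_t, gamma_t) / magnification(kappa, gamma), lam**-2)

# A second source population with Z = 0.5 sees a different reduced shear
print(reduced_shear(kappa, gamma, Z=0.5), reduced_shear(kappa_t, gamma_t, Z=0.5))
```

The size of the change in $g$ for the second population depends on how large $\kappa_\infty$ is, in line with the remark that only regions of noticeable non-linearity contribute to the degeneracy breaking.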



Fig. 16. The redshift weight function $Z(z_{\rm s})$, defined in (52), for three
different values of the lens redshift $z_{\rm d} = 0.2$, 0.5, and 0.8, and three
different geometries of the Universe, as indicated in the labels (here, $\Omega_{\rm m}$
is denoted as $\Omega_0$). Asymptotically for $z_{\rm s} \to \infty$, all curves tend to
$Z = 1$ (from Bartelmann & Schneider 2001)



In the non-linear case ($\gamma \neq g$) the reduced shear needs to be obtained
from smoothing the galaxy ellipticities in the first place. Since the relation
between $g$ and $\kappa$ is non-linear, the 'transitivity of convolutions' no longer
applies; one thus cannot start from a discretization of an integral over im- 
age ellipticities and smooth the resulting mass map later. We also note that 
the accuracy with which the (reduced) shear is estimated can be improved 
provided redshift estimates of individual source galaxies are available (see
Fig. 17). In particular for high-redshift clusters, redshift information on
individual source galaxies becomes highly valuable. This can be understood by
considering a high-redshift lens, where an appreciable fraction of faint 'source'
galaxies are located in front of the lens, and thus do not contribute to the 
lensing signal. However, they do contribute to the noise of the measurement. 
Redshift information allows the elimination of these foreground galaxies in 
the shear estimate and thus the reduction of noise.

Fig. 17. The fractional gain in accuracy of the shear estimate when using redshift
information of individual source galaxies, relative to the case where only the redshift
distribution of the population is known, plotted as a function of the lens redshift.
It is assumed that the sources have a broad redshift distribution, with a mean of
$\langle z_{\rm s}\rangle = 0.9$ (solid and dotted curves) or $\langle z_{\rm s}\rangle = 1.5$ (short-dashed and long-dashed
curves). The gain of accuracy also depends on the lens strength; the dotted and
long-dashed curves assume local lens parameters of $\gamma_\infty$ [...], whereas the
solid and short-dashed curves assume only very weak lensing, here approximated
by $\gamma_\infty$ = [...] = $\kappa_\infty$. One sees that the gain is dramatic once the lens redshift becomes
comparable to the mean redshift of the source galaxies and is therefore of great 
importance for high-redshift clusters (from Bartelmann & Schneider 2001) 



Finite-field mass reconstruction. In order to obtain a mass map from a 
finite data field, one starts from the relation (Kaiser 1995) 

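$$\nabla\kappa = \begin{pmatrix} \gamma_{1,1} + \gamma_{2,2} \\ \gamma_{2,1} - \gamma_{1,2} \end{pmatrix} \equiv \boldsymbol{u}_\gamma(\boldsymbol{\theta}) , \qquad (54)$$

which is a local relation between shear and surface mass density; it can easily
be derived from the definitions of $\kappa$ and $\gamma$ in terms of $\psi_{,ij}$. A similar relation
can be obtained in terms of reduced shear,

$$\nabla K(\boldsymbol{\theta}) = \frac{-1}{1 - g_1^2 - g_2^2} \begin{pmatrix} 1 - g_1 & -g_2 \\ -g_2 & 1 + g_1 \end{pmatrix} \begin{pmatrix} g_{1,1} + g_{2,2} \\ g_{2,1} - g_{1,2} \end{pmatrix} \equiv \boldsymbol{u}_g(\boldsymbol{\theta}) , \qquad (55)$$

where

$$K(\boldsymbol{\theta}) = \ln[1 - \kappa(\boldsymbol{\theta})] \qquad (56)$$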

is a non-linear function of $\kappa$. Based on these local relations, finite-field
inversion relations can be derived, and several of them appeared in the literature
right after the foregoing equations had been published. For example, it is
possible to obtain finite-field mass maps from line integrations (Schneider
1995; for other methods, see Squires & Kaiser 1996). Of all these finite-field
methods, one can be identified as optimal, by the following reasoning: in the
case of noise-free data, the imaginary part of (44) should vanish. Since one
is always dealing with noisy data (at least coming from the finite intrinsic
ellipticity distribution of the sources), in real life the imaginary part of (44)
will not be zero. But since it is solely a noise component, one can choose that
finite-field inversion which yields a zero imaginary component when averaged
over the data field (Seitz & Schneider 1996). One way of deriving this mass
map is obtained by a further differentiation of (54); this then yields a von
Neumann boundary-value problem on the data field $\mathcal{U}$ (Seitz & Schneider
2001),

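$$\nabla^2\kappa = \nabla\cdot\boldsymbol{u}_\gamma \quad\text{with}\quad \boldsymbol{n}\cdot\nabla\kappa = \boldsymbol{n}\cdot\boldsymbol{u}_\gamma \ \ \text{on}\ \partial\mathcal{U} , \qquad (57)$$

where $\boldsymbol{n}$ is the outward-directed normal on the boundary $\partial\mathcal{U}$ of $\mathcal{U}$. The
analogous equation holds for $K$ in terms of $g$ and $\boldsymbol{u}_g$,

$$\nabla^2 K = \nabla\cdot\boldsymbol{u}_g \quad\text{with}\quad \boldsymbol{n}\cdot\nabla K = \boldsymbol{n}\cdot\boldsymbol{u}_g \ \ \text{on}\ \partial\mathcal{U} . \qquad (58)$$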

Note that (57) determines the solution $\kappa$ only up to an additive constant,
and (58) determines $K$ only up to an additive constant, i.e., $(1 - \kappa)$ up
to a multiplicative factor. Hence, in both cases we recover the mass-sheet
degeneracies for the linear and non-linear case, respectively. The numerical
solution of these equations is fast, using overrelaxation (see Press et al. 1992).
In fact, the foregoing formulation of the problem is equivalent (Lombardi &
Bertin 1998) to the minimization of the action

$$A = \int_{\mathcal{U}} {\rm d}^2\theta \; |\nabla\kappa(\boldsymbol{\theta}) - \boldsymbol{u}_\gamma(\boldsymbol{\theta})|^2 , \qquad (59)$$

from which the von Neumann problem can be derived as the Euler equation
of the variational principle $\delta A = 0$. Furthermore, Lombardi & Bertin (1998)
have shown that the solution of (57) is 'optimal', in that for this estimator
the variance of $\kappa$ is minimized.
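
The variational form (59) lends itself to a short numerical sketch. The example below is not the overrelaxation scheme of Press et al. (1992); instead it discretizes the action on a pixel grid with forward differences and minimizes it by linear least squares, whose normal equations form a discrete Neumann problem. The function name `reconstruct_kappa`, the grid size and the Gaussian test map are assumptions made for illustration.

```python
import numpy as np

def reconstruct_kappa(u1, u2):
    """Zero-mean least-squares kappa map from the two components of u_gamma.

    u1[i, j] approximates d(kappa)/dx between pixels (i, j) and (i, j+1),
    u2[i, j] approximates d(kappa)/dy between pixels (i, j) and (i+1, j),
    both in units of the pixel size.
    """
    ny, nx = u1.shape[0], u1.shape[1] + 1
    assert u2.shape == (ny - 1, nx)
    idx = np.arange(ny * nx).reshape(ny, nx)
    left, right = idx[:, :-1].ravel(), idx[:, 1:].ravel()
    lower, upper = idx[:-1, :].ravel(), idx[1:, :].ravel()
    n1, n2 = left.size, lower.size
    # Dense difference operator D, so that the discrete action is |D kappa - u|^2.
    D = np.zeros((n1 + n2, ny * nx))
    rows1 = np.arange(n1)
    D[rows1, left] = -1.0
    D[rows1, right] = 1.0
    rows2 = n1 + np.arange(n2)
    D[rows2, lower] = -1.0
    D[rows2, upper] = 1.0
    rhs = np.concatenate([u1.ravel(), u2.ravel()])
    kappa, *_ = np.linalg.lstsq(D, rhs, rcond=None)
    kappa = kappa.reshape(ny, nx)
    # The additive constant is undetermined (mass-sheet degeneracy); fix the mean to zero.
    return kappa - kappa.mean()

# Noise-free test: a Gaussian mass peak on a 16 x 16 grid.
y, x = np.mgrid[0:16, 0:16]
kappa_true = np.exp(-((x - 9.0) ** 2 + (y - 6.0) ** 2) / 18.0)
u1 = np.diff(kappa_true, axis=1)
u2 = np.diff(kappa_true, axis=0)
kappa_rec = reconstruct_kappa(u1, u2)
```

For noise-free input the map is recovered up to the additive constant, which is the linear form of the mass-sheet degeneracy noted above. A dense matrix is used only for brevity; a realistic grid would call for a sparse or iterative solver.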

Since (57) provides a linear relation between the shear and the surface 
mass density, one expects that it can also be written in the form 

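$$\kappa(\boldsymbol{\theta}) = \int_{\mathcal{U}} {\rm d}^2\theta' \; \boldsymbol{H}(\boldsymbol{\theta};\boldsymbol{\theta}')\cdot\boldsymbol{u}_\gamma(\boldsymbol{\theta}') , \qquad (60)$$

where the vector field $\boldsymbol{H}(\boldsymbol{\theta};\boldsymbol{\theta}')$ is the Green's function of the von Neumann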
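problem (57). Accordingly,

$$K(\boldsymbol{\theta}) = \int_{\mathcal{U}} {\rm d}^2\theta' \; \boldsymbol{H}(\boldsymbol{\theta};\boldsymbol{\theta}')\cdot\boldsymbol{u}_g(\boldsymbol{\theta}') . \qquad (61)$$

Seitz & Schneider (1996) gave explicit expressions for $\boldsymbol{H}$ in the case of a
circular and rectangular data field.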

One might ask how important the changes in the resulting mass maps are 
compared to the Kaiser-Squires formula applied to a finite data field. For 
that we note that applying (44) or (50) to a finite data field is equivalent 
to setting the shear outside the data field to zero. Hence, the resulting mass 
distribution will be such as to yield a zero shear outside the data field, despite 
the fact that we have no indication from data that the shear indeed is zero 
there. This induces features in the mass map, in the form of a pillow-like overall
mass distribution. The amplitude of this feature depends on the strength of 
the lens, its location inside the data field, and in particular the size of the 
data field. Whereas for large data fields this amplitude is small compared to 
the noise amplitude of the mass map, it is nevertheless a systematic that can 
easily be avoided, and should be avoided, by using the finite-field inversions, 
which cause hardly any additional technical problems. 

Various tests have been conducted in the literature as to the accuracy of 
the various inversions. For those, one generates artificial shear data from a 
known mass distribution, and compares the mass maps reconstructed with 
the various methods with the original (e.g., Seitz & Schneider 1996, 2001; 
Squires & Kaiser 1996). One of the surprising results of such comparisons 
is that in some cases, the original Kaiser & Squires reconstruction fared
better than the explicit finite-field inversions, although it is known to yield
systematics. The explanation for this apparent paradox is, however, easy: the
mass models used in these tests consisted of one or more localized mass peaks
well inside the data field, so the shear outside the data field is very small.
Noting that the KS formula applied to a finite data field is equivalent to
setting $\gamma = 0$ outside the data field, this method provides 'information' to
the reconstruction process which is not really there, but for the mass models
used in the numerical tests is in fact close to the truth. Of course, by adding
this nearly correct 'information' to the mass reconstruction, the noise can
be lowered relative to the finite-field reconstructions where no assumptions
about the shear field outside the data field are made.

Constraints on the geometry of the Universe from weak lensing
mass reconstructions. The strength of the lensing signal depends, for a
given lens redshift, on the redshift of the sources, through the function $Z(z_{\rm s})$
(52). Suppose that the surface mass density of a cluster was well known, and
that the redshifts of background sources can be determined. Then, by
comparing the measured shear signal from sources at a given redshift $z_{\rm s}$ with the
one expected from the mass distribution, the value of $Z(z_{\rm s})$ can be
determined. Since $Z(z)$ depends on the geometry of the Universe, parameterized
through $\Omega_{\rm m}$ and $\Omega_\Lambda$, these cosmological parameters can in principle be
determined. A similar strategy for strong lensing clusters was described at the
end of Sect. 4.

Of course, the surface mass density of the cluster cannot be assumed to be
known, but needs to be reconstructed from the weak lensing data itself.
Consider for a moment only the amplitude of the surface mass density, assuming
that its shape is obtained from the reconstruction. Changing the function
$Z(z)$ by a multiplicative factor would be equivalent to changing the surface
mass density $\Sigma$ of the cluster by the inverse of this factor, and hence such a
constant factor in $Z$ is unobservable due to the mass-sheet degeneracy. Hence,
it is not the amplitude of the function $Z(z)$ shown in Fig. 16 that is important
here, but its shape.

Lombardi & Bertin (1999) have suggested a method to perform cluster 
mass reconstructions and at the same time determine the cosmological pa- 
rameters by minimizing the difference between the shear predicted from the 
reconstructed mass profile and the observed image ellipticities, where the 
former depends on the functional form of $Z(z)$. A nice and simple way to
illustrate such a method was given in Gautret et al. (2000), called the 'triplet
method'. Consider three background galaxies which have a small separation
on the sky, and assume that the three source redshifts are known. Because of
their closeness, one might assume that they all experience the same tidal field
and surface mass density from the cluster. In that case, the shear of the three
galaxies is described by five parameters: the two components of $\gamma_\infty$, $\kappa_\infty$, and
$\Omega_{\rm m}$ and $\Omega_\Lambda$. From the six observables (two components of three galaxy
ellipticities), one can minimize the difference between the predicted shear and
the observed ellipticities with respect to these five parameters, and in
particular obtain an estimate for the cosmological parameters. Repeating this
process for a large number of triplets of background galaxies, the accuracy
on the $\Omega$'s can be improved, and results from a large number of clusters can
be combined.

This procedure is probably too simple to be applied in practice; in particular,
it treats $\kappa_\infty$ and $\gamma_\infty$ for each triplet as independent numbers, whereas
the mass profile of the cluster is described by a single scalar function.
However, it nicely illustrates the principle. Lombardi & Bertin (1999) have used a
single density profile $\kappa_\infty(\boldsymbol{\theta})$ of the cluster, but assumed that the mass-sheet
degeneracy is broken by some other means. Jain & Taylor (2003) suggested a 
similar technique for employing the lensing strength as a function of redshifts 
and cosmological parameters to infer constraints on the latter. Clearly, more 
work is needed in order to turn these useful ideas into a practically applicable 
method. 

5.3 Inverse methods

In addition to these 'direct' methods for determining $\kappa$, inverse methods have
been developed, such as a maximum-likelihood fit (Bartelmann et al. 1996;
Squires & Kaiser 1996) to the data. There are a number of reasons why these
are in principle preferable to the direct methods discussed above. First, in the
direct methods, the smoothing scale is set arbitrarily, and in general kept
constant. It would be useful to have an objective way of choosing this scale,
and perhaps to let the smoothing scale be a function of position:
e.g., in regions with larger number densities of sources, the smoothing scale 
could be reduced. Second, the direct methods do not allow additional input 
coming from complementary observations; for example, if both shear and 
magnification information are available, the latter could not be incorporated 
into the mass reconstruction. The same is true for clusters where strong 
lensing constraints are known. 

The shear likelihood function. In the inverse methods, one tries to fit
a (very general) lens model to the observational data, such that the data
agree within the estimated errors with the model. In the maximum-likelihood
methods, one parameterizes the lens by the deflection potential $\psi$ on a grid
and then minimizes the regularized log-likelihood

$$-\ln\mathcal{L} = \sum_{i=1}^{N_{\rm g}} \left[ \frac{|\epsilon_i - g(\boldsymbol{\theta}_i,\{\psi_n\})|^2}{\sigma_i^2(\boldsymbol{\theta}_i,\{\psi_n\})} + 2\ln\sigma_i(\boldsymbol{\theta}_i,\{\psi_n\}) \right] + \lambda_{\rm e}\, S(\{\psi_n\}) , \qquad (62)$$

where $\sigma_i \approx \sigma_\epsilon \left(1 - |g(\boldsymbol{\theta}_i,\{\psi_n\})|^2\right)$ [see eq. (15) for the case $|g| < 1$ that was
assumed here], with respect to these gridded $\psi$-values; this specific form of the
likelihood assumes that the intrinsic ellipticity distribution follows a Gaussian
with width $\sigma_\epsilon$.⁶ In order to avoid overfitting, one needs a regularization term
$S$; entropy regularization (Seitz et al. 1998) seems very well suited (see Bridle
et al. 1998; Marshall et al. 2002 for alternative regularizations). The entropy
term $S$ gets large if the mass distribution has a lot of structure; hence, in
minimizing (62) one tries to match the data as closely as permitted by the
entropic term (Narayan & Nityananda 1986). As a result, one obtains a model
as smooth as compatible with the data, but where structure shows up where
the data require it. The parameter $\lambda_{\rm e}$ is a Lagrange multiplier which sets
the relative weight of the likelihood function and the regularization; it should
be chosen such that the $\chi^2$ per galaxy image is about unity, i.e.,

$$\sum_{i=1}^{N_{\rm g}} \frac{|\epsilon_i - g(\boldsymbol{\theta}_i,\{\psi_n\})|^2}{\sigma_i^2(\boldsymbol{\theta}_i,\{\psi_n\})} \approx N_{\rm g} ,$$

since then the deviation of the observed galaxy ellipticities from their
expectation value $g$ is as large as expected from the ellipticity dispersion. This
choice of the regularization parameter $\lambda_{\rm e}$ then fixes the effective smoothing
used for the reconstruction.

⁶ This specific form (62) of the likelihood function assumes that the sheared
ellipticity probability distribution follows a two-dimensional Gaussian with mean $g$
and dispersion $\sigma$; note that this assumption is not valid in general, not even when
the intrinsic ellipticity distribution is Gaussian (see Geiger & Schneider 1999 for
an illustration of this fact). The exact form of the lensed ellipticity distribution
follows from the intrinsic distribution $p_{\rm s}(\epsilon^{(\rm s)})$ and the transformation law (12)
between intrinsic and lensed ellipticity, $p(\epsilon) = p_{\rm s}(\epsilon^{(\rm s)}(\epsilon; g))\, \det(\partial\epsilon^{(\rm s)}/\partial\epsilon)$.
However, in many cases the Gaussian approximation underlying (62) is sufficient and
convenient for analytical considerations.
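
As a minimal sketch of how the data term of (62) and the $\chi^2$-per-galaxy criterion might be evaluated, the following example treats ellipticities and reduced shear as complex numbers and omits the regularization term. The function name `shear_data_term` and the mock catalogue are assumptions introduced here; in a real reconstruction the model shear would come from finite differencing the gridded potential.

```python
import numpy as np

def shear_data_term(eps, g_model, sigma_eps):
    """Data part of -ln L in (62) and the chi^2 per galaxy, for |g| < 1."""
    eps = np.asarray(eps, dtype=complex)
    g = np.asarray(g_model, dtype=complex)
    sigma_i = sigma_eps * (1.0 - np.abs(g) ** 2)
    chi2 = np.sum(np.abs(eps - g) ** 2 / sigma_i ** 2)
    neg_log_like = chi2 + 2.0 * np.sum(np.log(sigma_i))
    return neg_log_like, chi2 / eps.size

# Mock catalogue (assumption): constant reduced shear plus Gaussian scatter whose
# total dispersion equals sigma_i, i.e. sigma_i / sqrt(2) per component.
rng = np.random.default_rng(0)
n_gal = 20000
g_model = np.full(n_gal, 0.10 + 0.05j)
sigma_eps = 0.3
sigma_i = sigma_eps * (1.0 - np.abs(g_model) ** 2)
noise = (rng.normal(size=n_gal) + 1j * rng.normal(size=n_gal)) * sigma_i / np.sqrt(2.0)
eps = g_model + noise
neg_log_like, chi2_per_gal = shear_data_term(eps, g_model, sigma_eps)
```

When the model matches the data at the level of the ellipticity noise, `chi2_per_gal` comes out close to unity, which is the condition used above to set the regularization parameter.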

Strong lensing constraints can be incorporated into the inverse method 
by adding a term to the log-likelihood function which forces the minimum 
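to satisfy these strong constraints nearly precisely. For example, if a pair of
multiple images at $\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$ is identified, one could add the term

$$\lambda_{\rm s}\, |\boldsymbol{\beta}(\boldsymbol{\theta}_1) - \boldsymbol{\beta}(\boldsymbol{\theta}_2)|^2 = \lambda_{\rm s}\, \left| [\boldsymbol{\theta}_1 - \boldsymbol{\alpha}(\boldsymbol{\theta}_1)] - [\boldsymbol{\theta}_2 - \boldsymbol{\alpha}(\boldsymbol{\theta}_2)] \right|^2$$

to the log-likelihood; by turning up the parameter $\lambda_{\rm s}$, its minimum is
guaranteed to correspond to a solution where the multiple image constraint is
satisfied. Note that the form of this 'source-plane minimization' is simplified
(see Sect. 4.6 of SL), but in the current context this approach suffices.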



Magnification likelihood. Similarly, when accurate number counts of faint 
background galaxies are available, the magnification information can be
incorporated into the log-likelihood function. If the number counts behave
(locally) as a power law, $n(>S) \propto S^{-\alpha}$, the expected number of galaxies on
the data field $\mathcal{U}$ then is

$$\langle N\rangle = n_0 \int_{\mathcal{U}} {\rm d}^2\theta \; |\mu(\boldsymbol{\theta})|^{\alpha-1} ; \qquad (63)$$

see (26). The likelihood of observing $N$ galaxies at the positions $\boldsymbol{\theta}_i$ can then be
factorized into a term that yields the probability of observing $N$ galaxies when
the expected number is $\langle N\rangle$, and one that the $N$ galaxies are at their observed
locations. Since the probability for a galaxy to be at $\boldsymbol{\theta}_i$ is proportional to the
expected number density there, $n = n_0\, |\mu|^{\alpha-1}$, the likelihood function becomes
(Seitz et al. 1998)

$$\mathcal{L}_\mu = P_N(\langle N\rangle) \prod_{i=1}^{N} |\mu(\boldsymbol{\theta}_i)|^{\alpha-1} , \qquad (64)$$

with the first factor yielding the Poisson probability. Note that this expression 
assumes that the background galaxies are unclustered on the sky; in reality, 
where (even faint) galaxies cluster, this factorization does not strictly apply. 
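
A minimal sketch of (63) and (64) on a pixel grid is given below. It evaluates the logarithm of (64) as written, replacing the integral in (63) by a sum over pixels; any normalization of the position term is left out. The function names `expected_counts` and `magnification_log_like`, the grid and the number density are hypothetical choices for illustration.

```python
import numpy as np
from math import lgamma

def expected_counts(mu_grid, pixel_area, n0, alpha):
    """<N> from (63), with the integral replaced by a sum over grid pixels."""
    return n0 * pixel_area * np.sum(np.abs(mu_grid) ** (alpha - 1.0))

def magnification_log_like(mu_at_gal, mu_grid, pixel_area, n0, alpha):
    """ln of (64): Poisson term plus the product over the observed galaxies."""
    n_obs = len(mu_at_gal)
    n_exp = expected_counts(mu_grid, pixel_area, n0, alpha)
    log_poisson = n_obs * np.log(n_exp) - n_exp - lgamma(n_obs + 1)
    log_positions = (alpha - 1.0) * np.sum(np.log(np.abs(mu_at_gal)))
    return log_poisson + log_positions

# Toy field (assumption): 10 x 10 pixels of 1 arcmin^2 each, n0 = 1.2 per arcmin^2.
mu_flat = np.ones((10, 10))
n_unlensed = expected_counts(mu_flat, pixel_area=1.0, n0=1.2, alpha=0.5)
n_lensed = expected_counts(4.0 * mu_flat, pixel_area=1.0, n0=1.2, alpha=0.5)
```

With a count slope of $\alpha = 0.5$, a uniform magnification of 4 halves the expected counts in this toy field, which is the depletion effect that the likelihood exploits.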

It should be pointed out that the deflection potential $\psi$, and not the
surface mass density $\kappa$, should be used as variable on the grid, for two reasons:
first, shear and $\kappa$ depend locally on $\psi$, and are thus readily calculated by
finite differencing from $\psi$, whereas the relation between $\gamma$ and $\kappa$ is non-local
and requires summation over all gridpoints, which is of course more time
consuming. Second, and more important, the surface mass density on a finite
field does not determine $\gamma$ on this field, since mass outside the field contributes
to $\gamma$ as well. In fact, one can show (Schneider & Bartelmann 1997) that the
shear inside a circle is fully determined by the mass distribution inside the
circle and the multipole moments of the mass distribution outside the circle;
in principle, the latter can thus be determined from the shear measurement.

Despite these reasons, some authors prefer to construct inverse methods 
in which the surface mass density on a grid serves as the variable (e.g., Bridle
et al. 1998; Marshall et al. 2002). The fact that the mass density on a finite 
field does not describe the shear in this field is accounted for in these methods 
by choosing a reconstruction grid that is larger than the data field and by 
allowing the surface mass density in this outer region to vary as well. Whereas 
the larger numerical grid requires a larger numerical effort, in addition to the 
non-local relation between $\kappa$ and $\gamma$, this is of lesser importance, provided the
numerical resources are available. Worse, however, is the view that the mass 
distribution outside the data field obtained by this method has any physical 
significance! It has not. This mass distribution is solely one of infinitely many 
that can approximately generate the shear in the data field from mass outside 
the data field. The fact that numerical tests show that one can indeed recover 
some of the mass distribution outside the data field is again a fluke, since these 
models are usually chosen such that all of the mass outside the field
is contained in a boundary region around the data field which is part of the
numerical grid and hence, the necessary 'external' shear must be generated 
by a mass distribution in this boundary zone which by construction is where 
it is. In real life, however, there is no constraint on where the 'external' shear 
contribution comes from. 

5.4 Parameterized mass models 

Whereas the parameter-free mass maps obtained through one of the methods 
discussed above provide a direct view of the mass distribution of a cluster, 
their quantitative interpretation is not straightforward. Peaks in the surface 
mass density can indicate the presence of a mass concentration, or else be 
a peak caused by the ellipticity noise of the galaxies. Since the estimated 
values for $\kappa$ at different locations $\boldsymbol{\theta}$ are correlated [see eq. (49)], it is hard to
imagine 'error bars' attached to each point. Therefore, it is often preferable
to use parameterized mass models to fit the observed data; for example,
fitting shear (and/or magnification) data to an NFW mass profile (see IN,
Sect. 6.2) yields the virial mass $M_{200}$ of the cluster and its concentration
index $c$. There are basically two methods which have been used to obtain
such parameterized models. The first one, assuming a spherical mass model,
orders the tangential component of the observed image ellipticities into radial
bins and fits a parameterized shear profile through these bins, by minimizing
a corresponding $\chi^2$-function. One of the disadvantages of this method is that
the result of the fitting process can depend on the selected binning, but this 
can be largely avoided by choosing the bins fine enough. This then essentially 
corresponds to minimizing the first term in (62). 
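
The first method can be sketched as follows. The example computes the tangential ellipticity of each galaxy with respect to an assumed cluster center, using the common convention $\epsilon_{\rm t} = -{\rm Re}[\epsilon\, e^{-2{\rm i}\varphi}]$, and averages it in radial bins. The function names, the mock SIS shear field in the weak-lensing limit and the noise level are assumptions for illustration only.

```python
import numpy as np

def tangential_ellipticity(x, y, eps, center=(0.0, 0.0)):
    """Radius and tangential ellipticity of each galaxy about the given center."""
    dx, dy = x - center[0], y - center[1]
    phi = np.arctan2(dy, dx)
    r = np.hypot(dx, dy)
    eps_t = -np.real(eps * np.exp(-2j * phi))
    return r, eps_t

def binned_profile(r, eps_t, edges):
    """Mean tangential ellipticity and its standard error in radial bins."""
    which = np.digitize(r, edges) - 1
    nbin = len(edges) - 1
    mean = np.full(nbin, np.nan)
    err = np.full(nbin, np.nan)
    for b in range(nbin):
        sel = which == b
        if sel.sum() > 1:
            mean[b] = eps_t[sel].mean()
            err[b] = eps_t[sel].std(ddof=1) / np.sqrt(sel.sum())
    return mean, err

# Mock data (assumption): SIS with theta_E = 0.5 arcmin, positions in arcmin.
rng = np.random.default_rng(1)
x, y = rng.uniform(-10.0, 10.0, size=(2, 5000))
keep = np.hypot(x, y) > 1.0
x, y = x[keep], y[keep]
theta_E = 0.5
gamma = -theta_E / (2.0 * np.hypot(x, y)) * np.exp(2j * np.arctan2(y, x))
noise = rng.normal(scale=0.2, size=x.size) + 1j * rng.normal(scale=0.2, size=x.size)
r, eps_t = tangential_ellipticity(x, y, gamma + noise)
edges = np.linspace(1.0, 10.0, 7)
mean, err = binned_profile(r, eps_t, edges)
```

A parameterized shear profile can then be fitted to `mean` with weights `1 / err**2`; refining `edges` reduces the dependence on the binning mentioned above.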

Alternatively, a likelihood method can be used, in which the log-likelihood 
function (62), without the regularization term, is minimized, with the values
of the potential on the grid $\{\psi_n\}$ replaced by a set of parameters which
describe the mass profile. Schneider et al. (2000) have used this likelihood 
method to investigate with which accuracy the model parameters of a mass 
profile can be obtained, using both the shear information as well as mag- 
nification information from number counts depletion. One of the surprising 
findings of this study was that the slope of the fitted mass profile is highly de- 
generate if only shear information is used; indeed, the mass-sheet degeneracy 
strikes again and causes even fairly different mass profiles to have very similar 
reduced shear profiles, as is illustrated for a simple example in Fig. 18. In 
Fig. 19, the resulting degeneracy of the profile slope is seen. This degeneracy 
can be broken if number count information is used in addition. As seen in 
the middle panel of Fig. 18, the magnification profiles of the four models
displayed are quite different, and thus the number counts are sensitive to the profile
slope. Indeed, the confidence regions in the parameter fits, shown in Fig. 19,
obtained from the magnification information are highly inclined relative to 
those from the shear measurements, implying that the combination of both 
methods yields much better constraints on the model parameters. Of course, 
as mentioned before, the mass-sheet degeneracy can also be broken if redshift 
information of individual background galaxies is available. 
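
The degeneracy described above can be checked numerically with a short sketch. It applies the mass-sheet transformation $\kappa \to \lambda\kappa + (1-\lambda)$, $\gamma \to \lambda\gamma$ to an SIS profile and compares reduced shear and magnification before and after. The function names and the value $\lambda = 0.8$ are arbitrary illustrative choices.

```python
import numpy as np

def sis_kappa_gamma(theta, theta_E):
    """Convergence and tangential shear of a singular isothermal sphere."""
    kappa = theta_E / (2.0 * theta)
    return kappa, kappa.copy()

def mass_sheet_transform(kappa, gamma, lam):
    """kappa -> lam * kappa + (1 - lam), gamma -> lam * gamma."""
    return lam * kappa + (1.0 - lam), lam * gamma

theta = np.linspace(1.0, 15.0, 50)            # arcmin, outside the Einstein radius
kappa, gamma = sis_kappa_gamma(theta, theta_E=0.5)
kappa2, gamma2 = mass_sheet_transform(kappa, gamma, lam=0.8)

g_before = gamma / (1.0 - kappa)
g_after = gamma2 / (1.0 - kappa2)
mu_before = 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)
mu_after = 1.0 / ((1.0 - kappa2) ** 2 - gamma2 ** 2)
```

The reduced shear is unchanged while the magnification is rescaled by $1/\lambda^2$, which is why number counts can break the degeneracy that shear alone cannot.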

However, in order for the magnification information to yield significant
constraints on the mass parameters, one needs to know the unlensed number
density $n_0$ of sources quite accurately. In fact, even an uncertainty of less than
$\sim 10\%$ in the value of $n_0$ renders the magnification information in relation to
the shear information essentially useless (in the frame of parameterized
models). Note that an accurate determination of $n_0$ is difficult to achieve: since $n_0$
corresponds to the unlensed number density of faint galaxies at the same flux
limit as used for the actual data field, one requires an accurate photometric
calibration. A flux calibration uncertainty of 0.1 mag corresponds to an
uncertainty in $n_0$ of about 5% for a slope of $\alpha = 0.5$, and such uncertainties
are likely at the very faint flux limits needed to achieve a high number
density of sources. In addition, the presence of bright cluster galaxies renders
the detection and accurate brightness measurement of background galaxies
difficult and requires masking of regions around them. Nevertheless, in cases
where only magnification information is available, it can provide information
on the mass profile by itself. Such a situation can occur for observing
conditions with seeing above $\sim 1''$, when the shear method is challenged by the
smallness of faint galaxies.
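
The quoted link between photometric calibration and $n_0$ follows directly from the power-law counts $n(>S) \propto S^{-\alpha}$. The short sketch below works through this arithmetic; the function name `count_uncertainty` is an illustrative assumption.

```python
def count_uncertainty(delta_mag, alpha):
    """Fractional change of n(>S) ~ S**(-alpha) for a zero-point error delta_mag."""
    flux_ratio = 10.0 ** (-0.4 * delta_mag)   # effective shift of the flux limit
    return flux_ratio ** (-alpha) - 1.0

frac = count_uncertainty(delta_mag=0.1, alpha=0.5)
print(f"fractional uncertainty in n0: {frac:.3f}")
```

For a 0.1 mag zero-point error and $\alpha = 0.5$ this gives a little under 5%, in line with the figure quoted above.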

Fig. 18. The Einstein radius of a spherical mass distribution was assumed to be
$\theta_{\rm E} = 0.5'$, and the density profile outside the Einstein radius was assumed
to follow a power law, $\kappa(\theta) = a\,(\theta/\theta_{\rm E})^{-q}$; an SIS would have $a = 1/2$ and
$q = 1$. The figure displays for four combinations of model parameters the surface
mass density $\kappa(\theta)$, the function $\mu^{-1/2}$, which would be the depletion factor
for source counts of slope $\beta = 1/2$, and the reduced shear $g(\theta)$. As can be seen,
whereas the density profiles of the four models are quite different, the reduced
shear profiles are pairwise almost fully degenerate. This is due to the mass-sheet
degeneracy; it implies that it will be difficult to determine the slope $q$ of the
profiles from shear measurements alone, unless much larger fields around the
cluster are used (from Schneider et al. 2000)

The result shown in Fig. 19 implies that the shape of the mass profile
cannot be very well determined from the shear method, owing to the mass-sheet
degeneracy. This result extends to more general mass profiles than
power-law models; e.g., King & Schneider (2001) considered NFW models
with their two parameters $c$ and $r_{200}$. A fairly strong degeneracy between
these two parameters was found. Furthermore, the mass-sheet degeneracy
renders it surprisingly difficult to distinguish an isothermal mass model from
an NFW profile. The ability to distinguish these two families of models
increases with a larger field-of-view of the observations. This expectation was
indeed verified in King et al. (2002b) where the wide-field imaging data of
the cluster A 1689 were analyzed with the likelihood method. Although the
field size is larger than 30', so that the shear profile up to $\sim 15'$ from the
cluster center can be measured, an NFW profile is preferred with less than
90% confidence over a power-law mass model. The determination of the mass
profiles is likely to improve when strong lensing constraints are taken into
account as well.

Fig. 19. For the power-law models of Fig. 18, confidence regions in the slope $q$
and amplitude $a$ are drawn, as derived from the shear (thin solid contours), the
magnification (dotted) and their combination (thick solid). A number density of
30/arcmin$^2$ for shear measurements and 120/arcmin$^2$ for number counts was
assumed. Thick dashed curves show models with constant total number of galaxies
in the field, demonstrating that most of the constraint from magnification is due to 
the total counts, with little information about the detailed profile. It was assumed 
here that the unlensed number density of background galaxies is perfectly known; 
the fact that most of the magnification information comes from the total number 
of galaxies in the field implies that any uncertainty in the unlensed number density 
will quickly remove most of the magnification information (from Schneider et al. 
2000) 

The likelihood method for obtaining the parameters of a mass model is 
robust in the sense that the result is only slightly affected by substructure, 
as has been shown by King et al. (2001) using numerically generated cluster 
models. However, if a 'wrong' parameterization of the mass distribution is 
chosen, the interpretation of the resulting best-fit model must proceed care- 
fully, and the resulting physical parameters, such as the total mass, may be 
biased. The principal problems with parameterized models are the same as 
for lens galaxies in strong lensing: unless the parameters have a well-defined 
physical meaning, one does not learn much, even if they are determined with 
good accuracy (see Sect. 4.7 of SL). 

5.5 Problems of weak lensing cluster mass reconstruction and 
mass determination 

In this section, some of the major problems of determining the mass profile of 
clusters from weak lensing techniques are summarized. The finite ellipticity
dispersion of galaxies generates a noise which provides a fundamental limit
to the accuracy of all shear measurements. We will mention a number of 
additional issues here. 

Number 1: The mass-sheet degeneracy. As mentioned several times, 
the major problem is the mass-sheet degeneracy, which implies that there 
is always one arbitrary constant that is undetermined from the shear data. 
Number count depletion can in principle lift this degeneracy, but this
magnification effect has so far been observed in only a few clusters, and as mentioned
above, this method has its own problems. Employing redshift information of
individual source galaxies can also break this degeneracy (Bradac et al. 2004).
Note that the mass-sheet degeneracy causes quite different mass profiles to 
have very similar reduced shear profiles. 

Source redshift distribution. Since the critical surface mass density $\Sigma_{\rm cr}$
depends on the source redshift, a quantitative interpretation of the weak
lensing mass reconstruction requires the knowledge of the redshift distribution of
the galaxy sample used for the shear measurements. These galaxies are typically so
faint (and numerous) that it is infeasible to obtain individual spectroscopic
redshifts for them. There are several ways to deal with this issue: probably
the best is to obtain multi-color photometry of the fields and employ
photometric redshift techniques (e.g. Connolly et al. 1995; Benítez 2000; Bolzonella
et al. 2000). In order for them to be accurate, the number of bands needs to
be fairly large; in addition, since much of the background galaxy population
is situated at redshifts above unity, one requires near-IR images, as optical
photometry alone cannot be used for photometric redshifts above $z \sim 1.3$
(where the 4000 Å-break is redshifted out of the optical window). The
problem with near-IR photometry is, however, that currently near-IR cameras
have a substantially smaller field-of-view than optical cameras; in addition, 
due to the much higher sky brightness for ground-based near-IR observations, 
they extend to brighter flux limits (or smaller galaxy number densities) than 
optical images, for the same observing time. Nevertheless, upcoming wide- 
field near-IR cameras, such as the VISTA project on Paranal or WIRCAM 
at the CFHT, will bring great progress in this direction. 

The alternative to individual redshift estimates of background galaxies is 
to use the redshift distribution obtained through spectroscopic (or detailed 
photometric redshift) surveys in other fields, and identify this with the faint 
background galaxy population at the same magnitude. In this way, the
redshift distribution of the galaxies can be estimated. The issue that needs to be
considered here is that neither the targets for a spectroscopic survey, nor the
galaxy population from which the shear is estimated, are strictly magnitude
selected. Very small galaxies, for example, cannot be used for a shear
estimate (or are heavily downweighted) owing to their large smearing corrections
from the PSF. Similarly, for low-surface brightness galaxies it is much harder
to determine a spectroscopic redshift. Hence, in these redshift identifications,
care needs to be exercised.

For cluster mass reconstructions, the physical mass scale is obtained from
the average $\beta := \langle D_{\rm ds}/D_{\rm s}\rangle$ over all source galaxies. This average is fairly
insensitive to the detailed redshift distribution, as long as the mean source
redshift is substantially larger than the lens redshift. This is typically the
case for low-redshift ($z \lesssim 0.3$) clusters. However, for higher-redshift lenses,
determining $\beta$ requires a good knowledge of the galaxy redshift distribution.
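
How strongly $\beta$ depends on the lens redshift can be illustrated with a minimal sketch. It assumes a flat universe with a cosmological constant, for which $D_{\rm ds}/D_{\rm s}$ equals the ratio of comoving distances $(\chi_{\rm s} - \chi_{\rm d})/\chi_{\rm s}$, and sets the ratio to zero for sources in front of the lens. The value $\Omega_{\rm m} = 0.3$, the form of the source redshift distribution and all function names are assumptions chosen for illustration.

```python
import numpy as np

def comoving_distance(z, omega_m=0.3, n_steps=400):
    """Comoving distance in units of c/H0 for a flat model with a cosmological constant."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    # Trapezoidal rule written out explicitly.
    return np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))

def mean_beta(z_lens, z_src, weights, omega_m=0.3):
    """Weighted average of D_ds / D_s; foreground sources contribute zero."""
    chi_d = comoving_distance(z_lens, omega_m)
    ratios = np.zeros_like(z_src)
    for k, zs in enumerate(z_src):
        if zs > z_lens:
            chi_s = comoving_distance(zs, omega_m)
            ratios[k] = (chi_s - chi_d) / chi_s
    return np.sum(weights * ratios) / np.sum(weights)

# Assumed broad source distribution p(z) ~ z^2 exp[-(z/z0)^1.5] with z0 = 0.6.
z_src = np.linspace(0.05, 4.0, 80)
p_z = z_src ** 2 * np.exp(-(z_src / 0.6) ** 1.5)
beta_low = mean_beta(0.2, z_src, p_z)
beta_high = mean_beta(0.8, z_src, p_z)
```

For the high-redshift lens a large part of the assumed source population lies in front of it, so `beta_high` is much smaller than `beta_low` and becomes sensitive to the adopted distribution.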

Contamination of the source sample. Next on the list is the contam- 
ination of the galaxy sample from which the shear is measured by cluster 
galaxies; a fraction of the faint galaxies will be foreground objects or faint 
cluster members. Whereas the foreground population is automatically taken 
into account in the normal lensing analysis (i.e., in determining $\beta$), the cluster
members constitute an additional population of galaxies which is not included
in the statistical redshift distribution. The galaxy sample used for the shear
measurement is usually chosen to be substantially fainter than the brighter
cluster member galaxies; however, the abundance of dwarf galaxies in clusters
(or equivalently, the shape of the cluster galaxy luminosity function) is not
well known, and may vary substantially from cluster to cluster (e.g., Tren- 
tham & Tully 2002, and references therein). Including cluster members in 
the population from which the shear is measured weakens the lensing signal, 
since they are not sheared. As a consequence, a smaller shear is measured, 
and a lower cluster mass is derived. In addition, the dwarf contamination 
varies as a function of distance from the cluster center, so that the shape of 
the mass distribution will be affected. Color selection of faint galaxies can 
help in the selection of background galaxies, i.e., to obtain a cleaner set of 
true background galaxies. Of course, cluster dwarfs, if not properly accounted 
for, will also affect the magnification method. One method to deal with this 
problem is to use only galaxies redder than the Red Cluster Sequence of the 
cluster galaxies in the color-magnitude diagram, as this sequence indicates 
the reddest galaxies at the corresponding redshift. 
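
The dilution effect can be quantified with a minimal sketch. It assumes that cluster members are unlensed, randomly oriented and weighted like background galaxies, so that a contamination fraction $f$ reduces the mean tangential ellipticity by a factor $(1 - f)$. The contamination profile used here is purely hypothetical.

```python
import numpy as np

def diluted_shear(g_true, f_cluster):
    """Mean tangential ellipticity when a fraction f_cluster of the sample is unlensed."""
    return (1.0 - np.asarray(f_cluster)) * np.asarray(g_true)

theta = np.array([1.0, 2.0, 4.0, 8.0])        # arcmin from the cluster center
g_true = 0.5 / (2.0 * theta)                  # SIS with theta_E = 0.5', weak-lensing limit
f_cluster = 0.3 * np.exp(-theta / 2.0)        # hypothetical contamination profile
g_obs = diluted_shear(g_true, f_cluster)
ratio = g_obs / g_true
```

Because the assumed contamination falls off with radius, the measured profile is suppressed most strongly near the center, so both the amplitude and the shape of the inferred mass profile are affected.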

Accuracy of mass determination via weak lensing. Comparing the 
'true' mass of a cluster with that measured by weak lensing is not trivial, 
as one has to define what the true mass of a cluster is. Using clusters from 
numerical simulations, the mass is defined as the mass inside a sphere of 
radius $r_{200}$ around the cluster center within which the overdensity is 200 times
the critical density of the universe at the redshift considered. When comparing
this mass with the projected mass inside a circle of radius $R = r_{200}$, one
should not be surprised that the latter is larger (Metzler et al. 2001), since
one compares apples (the mass inside a sphere) with oranges (the mass within
a cylinder). Metzler et al. ascribed this to the mass in dark matter filaments
at the intersection of which massive clusters are located, but it is most likely
mainly an effect of the mass definitions. 

The mass-sheet degeneracy tells us that there is little hope of measuring the 'total'
mass of a cluster without further assumptions. Therefore, one natural strat- 
egy is to assume a parameterized mass profile and see how accurately one 
can determine these parameters. The effect of ellipticity noise has already 
been described in Sect. 5.4. Using simulated clusters, Clowe et al. (2004a) 
have studied the effect of asphericity and substructure of clusters on these 
mass parameters, by analyzing the shear field obtained from independent
projections of the clusters. They find that the non-spherical mass distribution
and substructure induce uncertainties in the two parameters ($r_{200}$ and
the concentration $c$) of an NFW profile which are larger than those from
the ellipticity noise under very good observing conditions. Among different
projections of the same cluster, the value of $r_{200}$ has a spread of 10 - 15%,
corresponding to a spread in virial mass of $\sim 40\%$. Averaging over the different
projections, they find that there is little bias in the mass determination,
except for clusters with very large ellipticity. 
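
The step from the spread in $r_{200}$ to the spread in mass follows from the
definition of $r_{200}$ given above. A short worked version, assuming that
first-order error propagation is adequate for these fractional spreads (the
symbol $\rho_{\rm cr}$ for the critical density is notation introduced here):

```latex
M_{200} = \frac{4\pi}{3}\, 200\, \rho_{\rm cr}(z)\, r_{200}^3
\quad\Longrightarrow\quad
\frac{\delta M_{200}}{M_{200}} \approx 3\, \frac{\delta r_{200}}{r_{200}}
\approx 3 \times (0.10 \ldots 0.15) \approx 0.30 \ldots 0.45 ,
```

which brackets the quoted value of ~ 40%.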

Lensing by the large-scale structure. Lensing by foreground and background
density inhomogeneities (i.e., the LSS) yields a fundamental limit to the
accuracy of cluster mass estimates. Since lensing probes the projected
density, these foreground and background inhomogeneities are present in the
lensing signal. Hoekstra (2003) has investigated this effect in the
determination of the parameters of an NFW mass profile; we shall return to
this issue in Sect. 9.2 below when we consider lensing by the large-scale
structure. In principle, the foreground and background contributions can be
eliminated if the individual redshifts of the source galaxies are known,
since in this case a three-dimensional mass reconstruction becomes possible
(see Sect. 7.6); however, the resulting cluster mass map will be very noisy.

5.6 Results 

After the first detection of a coherent alignment of galaxy images in two 
clusters by Tyson et al. (1990) and the development of the Kaiser & Squires 
(1993) mass reconstruction method, the cluster MS 1224+20 was the first for 
which a mass map was obtained (Fahlman et al. 1994). This investigation of 
the X-ray selected cluster yielded a mass map centered on the X-ray centroid 
of the cluster, but also a surprisingly high M/L ratio of ~ 800 h (here and
in the following we always quote mass-to-light ratios in Solar units). This
high M/L ratio was later confirmed in an independent analysis by Fischer
(1999). This mass estimate is in strong conflict with that obtained from a
virial analysis (Carlberg 1994); however, it is known that this cluster has a
very complex structure, is not relaxed, and is most likely a superposition of
galaxy concentrations in redshift.

Fig. 20. Contours show the mass reconstruction of the cluster A1689, obtained
from data taken with the WFI at the ESO/MPG 2.2m telescope. The image is ~ 33'
on a side, corresponding to ~ 4.3 $h^{-1}$ Mpc at the cluster redshift of
$z_{\rm d} = 0.18$. In the lower panel, the reduced shear profile is shown,
together with the best fitting SIS and NFW models. The mass reconstruction has
been smoothed by a 1'.15 Gaussian, and the contour spacing is
$\Delta\kappa = 0.01$. No corrections have been applied to account for
contamination of the lensing signal by cluster dwarf galaxies - that would
increase the mass of the best fit models by ~ 25% (taken from Clowe &
Schneider 2001)

Since this pioneering work, mass reconstructions of many clusters have 
been performed; see Mellier (1999) and Sect. 5.4 of BS. Here, only a few 
recent results shall be mentioned, followed by a summary. 

Wide-field mass reconstructions. The advent of large mosaic CCD cameras
provides an opportunity to map large regions around clusters to be used for a
mass reconstruction, and thus to measure the shear profile out to the virial
radius of clusters. These large-scale observations offer the best promise to
investigate the outer slope of the mass profile, and in particular to
distinguish between isothermal distributions and those following the NFW
profile. Fig. 20 shows an example of such a mass reconstruction, that of the
cluster Abell 1689 with $z_{\rm d} = 0.182$. A significant shear is observed
out to the virial radius. The mass peak is centered on the brightest cluster
galaxy, and the overall lensing signal is significant at the 13.4-$\sigma$
level. The shear signal is fit with two models, as shown in the lower panel of
Fig. 20; the NFW profile yields a better fit than an SIS profile. Two more
clusters observed with the WFI by Clowe & Schneider (2002) yield similar
results, i.e., a detection of the lensing signal out to the virial radius, and
a preference for an NFW mass profile, although in one of the two cases this
preference is marginal. The lensing signal of such rich clusters could be
contaminated by faint cluster member galaxies; correcting for this effect
would increase the estimate of the lensing strength, but requires multi-color
imaging for source selection.

The cluster A1689 is (one of) the strongest lensing clusters known (see
Fig. 14); in fact, it is strong enough that a weak lensing signal can be
significantly detected from near-IR images (King et al. 2002a) despite the
fact that the usable number density of (background) galaxies is only ~ 3
arcmin$^{-2}$. The estimate of its velocity dispersion from weak lensing
yields an Einstein radius well below the distance of the giant arcs from the
cluster center. Hence, in this cluster we see a discrepancy between the strong
and weak lensing results, which cannot be easily explained by redshift
differences between the arc sources and the mean redshift of the faint
galaxies used for the weak lensing analysis. On the other hand, A1689 is known
not to be a relaxed cluster, as indicated by the redshift distribution of its
member galaxies. This may explain the fact that the weak lensing mass estimate
is also lower than that obtained from X-ray studies.
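
The comparison above rests on translating a velocity dispersion into an
Einstein radius. A minimal sketch of that step, assuming a singular isothermal
sphere (SIS), for which $\theta_{\rm E} = 4\pi (\sigma_v/c)^2 D_{\rm ds}/D_{\rm s}$;
the function name and the example numbers are illustrative assumptions and are
not values from the analyses cited above:

```python
import math

C_KM_S = 299792.458                       # speed of light in km/s
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi  # radians -> arcseconds


def sis_einstein_radius_arcsec(sigma_v_km_s, dds_over_ds):
    """Einstein radius of an SIS lens in arcseconds.

    sigma_v_km_s : line-of-sight velocity dispersion in km/s
    dds_over_ds  : ratio of the angular-diameter distances lens-source
                   and observer-source (depends on the assumed cosmology
                   and redshifts; supplied by the caller)
    """
    theta_e_rad = 4.0 * math.pi * (sigma_v_km_s / C_KM_S) ** 2 * dds_over_ds
    return theta_e_rad * RAD_TO_ARCSEC


if __name__ == "__main__":
    # Hypothetical example: sigma_v = 1000 km/s and D_ds / D_s = 0.7.
    print(sis_einstein_radius_arcsec(1000.0, 0.7))
```

Because the Einstein radius scales with the square of the velocity dispersion,
a modest underestimate of the dispersion translates into a substantially
smaller predicted Einstein radius.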

Filaments between clusters. One of the predictions of CDM models for
structure formation is that clusters of galaxies are located at the
intersection points of filaments formed by the dark matter distribution. In
particular, this implies that a physical pair of clusters should be connected
by a bridge or filament of (dark) matter, and weak lensing mass
reconstructions can in principle be used to search for them. In the
investigation of the z = 0.42 supercluster MS0302+17, Kaiser et al. (1998)
found an indication of a possible filament connecting two of the three
clusters, with the caveat (as pointed out by the authors) that the filament
lies just along the boundary of two CCD chips; in fact, an independent
analysis of this supercluster (Gavazzi et al. 2004) failed to confirm this
filament. Gray et al. (2002) saw evidence for a filament connecting the two
clusters A901a/A901b in their mass reconstruction of the A901/902 supercluster
field. Another potential filament has been found in the wide-field mass
reconstruction of the field containing the pair of clusters A222/223 (Dietrich
et al. 2004). Spectroscopy shows that there are also galaxies at the same
redshift as the two clusters present in the 'filament' (Dietrich et al. 2002).

One of the problems related to the unambiguous detection of filaments is the
difficulty of defining what a 'filament' is, i.e., of devising a statistic
that quantifies the presence of a mass bridge. The eye easily picks up a
pattern and identifies it as a 'filament', but quantifying such a pattern
turns out to be very difficult, as shown by Dietrich et al. (2004). Because of
that, it is difficult to distinguish between noise in the mass maps, the
'elliptical' extension of two clusters pointing towards each other, and a true
filament. However, this problem is not specific to the weak lensing
investigation: even if the true projected mass distribution of a pair of
clusters were known (e.g., from a cluster pair in numerical simulations), it
is not straightforward to define what a filament
would be.

Fig. 21. A deep R-band image of the cluster pair Abell 222/223, obtained from
two different pointings with the WFI@ESO/MPG 2.2m, with contours showing the
reconstructed $\kappa$-map. The two clusters are in the region where the
pointings overlap and thus deep imaging is available there. Both clusters are
obviously detected in the mass map, with A223 (the Northern one) clearly split
up into two subclusters. The mass reconstruction shows a connection between
the two clusters which can be interpreted as a filament; galaxies at the
clusters' redshift are present in this intercluster region. A further mass
concentration is seen about 13' to the South-East of A222, which is
significant at the $3.5\sigma$ level and where a clear concentration of
galaxies is visible. A possible red cluster sequence indicates a substantially
higher redshift for this cluster, compared to $z \approx 0.21$ of the double
cluster (from Dietrich
et al. 2004)

Correlation between mass and light. Mass reconstructions on wide fields,
particularly those covering supercluster regions, are ideally suited to
investigate the relation between mass and galaxy light. For example, a
smoothed light map of the color-selected early-type galaxies can be correlated
with the reconstructed $\kappa$-map; alternatively, assuming that light traces
mass, the expected shear map can be predicted from the early-type galaxies and
compared to the observed shear, with the mass-to-light ratio being the
essential fit parameter. Such studies have been carried out on the
aforementioned supercluster fields, as well as on blank fields (Wilson et al.
2001). These studies yield very consistent results, in that the mass of
clusters is very well traced by the distribution of early-type galaxies, but
the mass-to-light ratio seems to vary between different fields, with ~ 400 h
(in solar units) for the 0302 supercluster (Gavazzi et al. 2004), ~ 200 h for
the A901/902 supercluster (Gray et al. 2002), and ~ 300 h for empty fields
(Wilson et al. 2001) in the rest-frame B-band. When one looks in more detail
at these supercluster fields, interesting additional complications appear. The
three clusters in the 0302 field, as well as the three clusters in the
A901/902 field (A901 is indeed a pair of clusters), have quite different
properties. In terms of number density of color-selected galaxies, A901a and
A902 dominate the field, whereas only A901b seems to be detected in X-rays.
Considering early-type galaxies' luminosity, A901a is the most prominent of
the three clusters. In contrast to this, A902 seems to be the most massive as
judged from the weak lensing reconstruction. Similar differences between the
three clusters in the 0302 field are also seen. It therefore appears that the
mass-to-light properties of clusters cover quite a range.

Cluster mass reconstructions from space. The exquisite image quality that can
be achieved with the HST - imaging without the blurring effects of atmospheric
seeing - suggests that such data would be ideal for weak lensing studies. This
is indeed partly true: from space, the shape of smaller galaxy images can be
measured than from the ground, where the size of the seeing disk limits the
image size of galaxies that can be used for ellipticity measurements in
practice. Fig. 22 shows an HST image of the cluster A851
($z_{\rm d} = 0.41$), together with a mass reconstruction. The agreement
between the mass distribution and the angular distribution of bright cluster
galaxies is striking. A detailed X-ray observation of this cluster with
XMM-Newton (De Filippis et al. 2003) finds two extended X-ray components
coinciding with the two maxima of the bright galaxy distribution, and thus of
the mass map shown in Fig. 22, in addition to several compact X-ray sources
inside the HST field. Clearly, this cluster is a dynamically young system, as
also seen by the inhomogeneities of the X-ray temperature and metallicity of
the intracluster gas.

The drawback of cluster weak lensing studies with the HST is the small 
field-of-view of its WFPC2 camera, which precludes imaging of large regions

Fig. 22. The left panel shows a WFPC2@HST image of the cluster Cl0939+4713
(=Abell 851; taken from Seitz et al. 1996; the field is about 2'.5 on a side),
whereas the right panel shows a mass reconstruction obtained by Geiger &
Schneider (1999); this was obtained using the entropy-regularized maximum
likelihood method of Seitz et al. (1998). One notices the increased spatial
resolution of the resulting mass map near the center of the cluster, which
this method yields 'automatically' in those regions where the shear signal is
large. Indeed, this mass map predicts that the cluster is critical in the
central part, in agreement with the finding of Trager et al. (1997) that
strong lensing features (multiple images plus an arc) of sources with
$z \sim 4$ are seen there. The strong correlation between the distribution of
mass and that of the bright cluster galaxies is obvious: Not only does the
peak of the mass distribution coincide with the light center of the cluster,
but also a secondary maximum in the surface mass density corresponds to a
galaxy concentration (seen in the lower middle), as well as a pronounced
minimum on the left where hardly any bright galaxies are visible

around the cluster center. To compensate for this, one can use multiple
pointings to tile a cluster. For example, Hoekstra and collaborators have
observed three X-ray selected clusters with HST mosaics; the results from this
survey are summarized in Hoekstra et al. (2002d). One example is shown in Fig.
23, the high-redshift cluster MS1054-03 at $z_{\rm d} = 0.83$. Also in this
cluster one detects clear substructure, here consisting of three mass peaks,
which is matched by the distribution of bright cluster galaxies. The shape of
the mass maps indicates that this cluster is not relaxed, but perhaps in a
later stage of merging, a view also supported by its hot X-ray temperature. In
fact, new observations with Chandra and XMM-Newton of MS 1054 have shown that
this cluster has a much lower temperature than measured earlier with ASCA
(Gioia et al. 2004). Only two of the three components seen in the galaxy
distribution and the mass reconstruction are seen in X-rays, with the central
weak lensing component being the dominant X-ray source. The newly determined
X-ray temperature is consistent with the velocity dispersion of cluster
galaxies.




Fig. 23. Mass reconstruction (contours) of the inner part of the high-redshift
($z_{\rm d} = 0.83$) cluster MS1054-03, based on a mosaic of six pointings
obtained with the WFPC2@HST (from Hoekstra et al. 2000). The splitting of the
cluster core into three subcomponents, also previously seen from ground-based
images by Clowe et al. (2000), shows that this cluster is not yet relaxed



Magnification effects. As mentioned in Sect. 2.4, the magnification of a
lens can also be used to reconstruct its surface mass density (Broadhurst et
al. 1995). Provided a population of background source galaxies is identified
whose number count slope $\alpha$ - see (26) - differs significantly from
unity, local counts of these sources can be turned into an estimator of the
local magnification. If the lens is weak, (27) provides a relation between the
local number counts and the local surface mass density. If the lens is not
weak, this relation no longer suffices, but one needs to use the full
expression

$$ |\mu|^{-1} = \left| (1-\kappa)^2 - |\gamma|^2 \right| , \qquad (65) $$

where we have written absolute values to account for the fact that the sign of
the magnification cannot be observed. There are two obvious difficulties with
(65): the first comes from the sign ambiguities, namely whether $\mu$ is
positive or negative, and whether $\kappa < 1$ or $> 1$. Assuming that we are
in the region of the cluster where $\mu > 0$ and $\kappa < 1$ (that is,
outside the outer critical curve), then (65) can be rewritten as

$$ \kappa = 1 - \sqrt{\mu^{-1} + |\gamma|^2} , \qquad (66) $$

which shows the second difficulty: in order to estimate $\kappa$ from $\mu$,
one needs to know the shear magnitude $|\gamma|$.

There are various ways to deal with this second problem. Consider first the
case that the (reduced) shear is also observed, in which case one better
writes

$$ \kappa = 1 - \left[ \mu \left( 1 - |g|^2 \right) \right]^{-1/2} ; \qquad (67) $$

but of course, if shear measurements are available, they should be combined
with magnification observations in a more optimized way. A second method,
using magnification only, is based on the fact that $\gamma$ depends linearly
on $\kappa$ (ignoring finite-field problems here), and so (66) can be turned
into a quadratic equation for the $\kappa$ field (Dye & Taylor 1998). From
numerical models of clusters, van Kampen (1998) claimed that the shear in
these clusters approximately follows on average a relation of the form
$|\gamma| = (1-c)\sqrt{\kappa/c}$, with $c \sim 0.7$; however, there is (as
expected) large scatter around this mean relation, which by itself has little
theoretical justification. Fig. 24 shows the mass reconstruction of the
cluster Cl 0024+17 using galaxy number counts and the two reconstruction
methods just mentioned.
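
The local relations between magnification, shear and surface mass density
discussed above can be sketched in a few lines. This is only an illustration
of the formulas (65) to (67) and of the local shear relation quoted from van
Kampen (1998), under the stated assumption $\mu > 0$, $\kappa < 1$; it is not
the reconstruction pipeline of the papers cited, and all function names are
hypothetical.

```python
import math


def inverse_magnification(kappa, gamma_abs):
    """|mu|^-1 as a function of kappa and |gamma|, eq. (65)."""
    return abs((1.0 - kappa) ** 2 - gamma_abs ** 2)


def kappa_from_mu_and_shear(mu, gamma_abs):
    """Eq. (66): kappa from mu and |gamma|, for mu > 0 and kappa < 1."""
    return 1.0 - math.sqrt(1.0 / mu + gamma_abs ** 2)


def kappa_from_mu_and_reduced_shear(mu, g_abs):
    """Eq. (67): kappa from mu and |g|, for mu > 0, kappa < 1, |g| < 1."""
    return 1.0 - (mu * (1.0 - g_abs ** 2)) ** -0.5


def kappa_from_mu_local_relation(mu, c=0.7):
    """kappa from mu alone, assuming |gamma| = (1 - c) * sqrt(kappa / c).

    Inserting this relation into (65) gives the quadratic
        kappa^2 - b * kappa + (1 - 1/mu) = 0,   b = 2 + (1 - c)^2 / c,
    whose smaller root is the solution with kappa < 1.
    """
    b = 2.0 + (1.0 - c) ** 2 / c
    return 0.5 * (b - math.sqrt(b * b - 4.0 * (1.0 - 1.0 / mu)))


if __name__ == "__main__":
    kappa, gamma = 0.2, 0.1
    mu = 1.0 / inverse_magnification(kappa, gamma)
    print(mu, kappa_from_mu_and_shear(mu, gamma))
```

The last function makes explicit why the local relation is convenient: it
removes the need for a shear measurement, at the price of relying on a mean
relation with large scatter.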




[Figure: mass maps; horizontal axis X (arcmin)]

Fig. 24. Mass reconstruction of the cluster Cl 0024+17 from the magnification
method. The two different reconstructions are based on two different ways to
turn the magnification signal - number count depletions - into a surface mass
density map, as described in the text: in the upper panel, a local relation
between surface mass density and shear magnitude has been used, whereas in the
lower panel, the magnification was transformed into a $\kappa$ map using the
(non-local) quadratic dependence of the inverse magnification on the surface
mass density field. Overall, these two reconstructions agree very well. To
account for the presence of bright foreground galaxies, the data field had to
be masked before local number densities of background galaxies were estimated
- the mask is shown in Fig. 25 (from Dye et al. 2002)



Magnification effects have been observed for a few clusters, most notably
Cl 0024+17 (Fort et al. 1997; Rognvaldsson et al. 2001; Dye et al. 2002) and
A1689 (Taylor et al. 1998; Dye et al. 2001). We shall describe some of the
results obtained for Cl 0024+17 as an example (Dye et al. 2002). Since the
cluster galaxies generate a local overdensity of galaxy counts, they need to
be removed first, which can be done based on a color and magnitude criterion.
Comparison with extensive spectroscopy of this cluster (Czoske et al. 2001)
shows that this selection is very effective for the brighter objects. For the
fainter galaxies - those from which the lensing signal is actually measured -
a statistical subtraction of foreground and cluster galaxies needs to be
performed, which is done by subtracting galaxies according to the field
luminosity function with $z < z_{\rm d}$ and cluster galaxies according to the
cluster luminosity function. The latter is based on the assumption that the
luminosity distribution of cluster galaxies is independent of the distance to
the cluster center. Next, the field of the cluster needs to be masked for
bright objects, near which the photometry of fainter galaxies becomes
inaccurate or impossible; Fig. 25 shows the masked data field. The number
density of sources is then determined from the unmasked area. The resulting
mass reconstruction is shown in Fig. 24. The results confirm the earlier
finding from

Fig. 25. The mask of the data field of the cluster Cl 0024+17 (grey circles)
and the location of putative background objects (crosses). The inner dashed
circle shows the critical curve of the cluster as derived from the multiply
imaged arc system (from Dye et al. 2002)



strong lensing (see Sect. 4.4) that the mass in the inner part of this cluster
is larger by a factor ~ 3 than estimated from its X-ray emission (Soucail et
al. 2000).

Magnification and shear method compared. It is interesting to consider the
relative merits of shear and magnification methods for weak lensing studies.
The numbers of clusters that have been investigated with either method are
quite different, with less than a handful for which the magnification effect
has been seen. There are several reasons for this. First, the shear method
does not need external calibration, as it is based on the assumption of random
source ellipticity; in contrast to this, the magnification method requires the
number counts of unlensed sources. Whereas this can be obtained from the same
dataset, provided it covers a sufficiently large area, this self-calibration
removes one of the strongest appeals of the magnification effect, namely its
potential to break the mass-sheet degeneracy. Second, the magnification method
is affected by the angular correlation of galaxies, as clearly demonstrated by
Athreya et al. (2002) in their study of the cluster MS 1008-1224, where the
background number counts revealed the presence of a background cluster which,
if not cut out of the data, would contaminate the resulting mass profile
substantially. Third, the removal of foreground galaxies, and more seriously,
of faint cluster members introduces an uncertainty in the results which is
difficult to control. Finally, the number count method yields a lower lensing
signal-to-noise than the shear method: If we consider $N_\gamma$ and $N_\mu$
galaxies in a given patch of the sky, such that for the former ones the
ellipticities have been measured, and for the latter ones accurate photometry
is available and the galaxies are above the photometric completeness
brightness, the signal-to-noise ratios from the shear - see (15) - and number
count methods are

$$ \left(\frac{S}{N}\right)_\gamma = \frac{|\gamma|}{\sigma_\epsilon} \sqrt{N_\gamma} ; \quad
   \left(\frac{S}{N}\right)_\mu = 2\kappa\, |\alpha - 1| \sqrt{N_\mu} , \qquad (68) $$

where we employed (27) in the latter case and assumed that the source galaxy
positions are uncorrelated. The ratio of these two S/N values is

$$ \frac{(S/N)_\gamma}{(S/N)_\mu} = \frac{|\gamma|}{\kappa}\,
   \frac{1}{2 \sigma_\epsilon |1-\alpha|}\, \sqrt{\frac{N_\gamma}{N_\mu}} . \qquad (69) $$

For an isothermal mass profile, the first of these factors is unity. With
$\sigma_\epsilon \approx 0.4$ and $\alpha \sim 0.75$ for R-band counts, the
second factor is ~ 5. The final factor
depends on the quality of the data: in good seeing conditions, this ratio is of 
order unity. However, when the seeing is bad, the photometric completeness 
level can be considerably fainter than the magnitude for which the shape 
of galaxies can be measured reliably. Therefore, for data with relatively bad 
seeing, the magnification effect may provide a competitive means to extract 
weak lensing information. Having said all of this, the magnification method 
will keep its position as an alternative to shear measurements, in particular 
for future multi-color datasets where the separation of foreground and cluster 
galaxies from the background population can be made more cleanly. 
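
The three factors in the signal-to-noise comparison can be evaluated
numerically. The following sketch simply codes the two signal-to-noise
expressions given above for the shear and the number count methods; the
function names and the example galaxy numbers are assumptions made for
illustration.

```python
import math


def sn_shear(gamma_abs, sigma_eps, n_gamma):
    """Shear signal-to-noise: |gamma| / sigma_eps * sqrt(N_gamma)."""
    return gamma_abs / sigma_eps * math.sqrt(n_gamma)


def sn_counts(kappa, alpha, n_mu):
    """Number-count signal-to-noise: 2 kappa |alpha - 1| sqrt(N_mu)."""
    return 2.0 * kappa * abs(alpha - 1.0) * math.sqrt(n_mu)


def sn_ratio(gamma_abs, kappa, sigma_eps, alpha, n_gamma, n_mu):
    """Ratio of shear to number-count signal-to-noise."""
    return sn_shear(gamma_abs, sigma_eps, n_gamma) / sn_counts(kappa, alpha, n_mu)


if __name__ == "__main__":
    # Isothermal profile (|gamma| = kappa), sigma_eps = 0.4, alpha = 0.75.
    print(sn_ratio(0.1, 0.1, 0.4, 0.75, 100, 100))  # equal galaxy numbers
    print(sn_ratio(0.1, 0.1, 0.4, 0.75, 100, 400))  # hypothetical bad seeing
```

With equal numbers of galaxies the shear method wins by the factor ~ 5 quoted
above; if, in bad seeing, four times as many galaxies have usable photometry
as have usable shapes, the advantage shrinks to a factor of 2.5.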

Summary. The mass reconstruction of clusters using weak lensing has by 
now become routine; quite a few cameras at excellent sites yield data with 
sub-arcsecond image quality to enable this kind of work. Overall, the recon- 
structions have shown that the projected mass distribution is quite similar to 
that of the projected galaxy distribution and the shape of the X-ray emission, 
at least for clusters that appear relaxed. There is no strong evidence for a dis- 
crepancy between the mass obtained from weak lensing and that from X-rays, 
again with exceptions such as Cl 0024+16 mentioned above (which most likely
is not a single cluster). The weak lensing mass profiles are considered more
reliable than the ones obtained from X-ray studies, since they do not rely 
on symmetry or equilibrium assumptions. On the other hand, they contain 
contributions from foreground and background mass inhomogeneities, and 
are affected by the mass-sheet degeneracy. What is still lacking is a combined 
analysis of clusters, making use of weak lensing, X-ray, Sunyaev-Zeldovich, 
and galaxy dynamics measurements, although promising first attempts have 
been published (e.g., Zaroubi et al. 1998, 2001; Reblinsky 2000; Dore et al. 
2001; Marshall et al. 2003). 

5.7 Aperture mass and other aperture measures 

In the weak lensing regime, $\kappa \ll 1$, the mass-sheet degeneracy
corresponds to adding a uniform surface mass density $\kappa_0$. However, one
can define quantities
in terms of the surface mass density which are invariant under this transfor- 
mation. In addition, several of these quantities can be determined directly in 
terms of the locally measured shear. In this section we shall present the basic 
properties of the aperture measures, whereas in the following section we shall 
demonstrate how the aperture mass can be used to find mass concentrations 
based solely on their weak lensing properties. 

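Aperture mass. Let $U(|\boldsymbol{\theta}|)$ be a compensated weight (or
filter) function, meaning $\int d\theta\, \theta\, U(\theta) = 0$; then the
aperture mass

$$ M_{\rm ap}(\boldsymbol{\theta}_0) := \int d^2\theta\; \kappa(\boldsymbol{\theta})\, U(|\boldsymbol{\theta} - \boldsymbol{\theta}_0|) \qquad (70) $$

is independent of $\kappa_0$, as can be easily seen. For example, if $U$ has
the shape of a Mexican hat, $M_{\rm ap}$ will have a maximum if the filter is
centered on a mass concentration. The important point to notice is that
$M_{\rm ap}$ can be written directly in terms of the shear (Kaiser et al.
1994; Schneider 1996),

$$ M_{\rm ap}(\boldsymbol{\theta}_0) = \int d^2\theta\; Q(|\boldsymbol{\theta} - \boldsymbol{\theta}_0|)\, \gamma_{\rm t}(\boldsymbol{\theta}; \boldsymbol{\theta}_0) , \qquad (71) $$

where we have defined the tangential component $\gamma_{\rm t}$ of the shear
relative to the point $\boldsymbol{\theta}_0$ [cf. eq. 17], and

$$ Q(\theta) = \frac{2}{\theta^2} \int_0^\theta d\theta'\, \theta'\, U(\theta') - U(\theta) . \qquad (72) $$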



These relations can be derived from (54), by rewriting the partial derivatives
in polar coordinates and subsequent integration by parts (see Schneider &
Bartelmann 1997); they can also be derived directly from the Kaiser & Squires
inversion formula (44), as shown in Schneider (1996). Perhaps easiest is the
following derivation (Squires & Kaiser 1996): We first rewrite (70) as

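$$ M_{\rm ap} = 2\pi \int_0^{\theta_u} d\vartheta\; \vartheta\, U(\vartheta)\, \langle\kappa(\vartheta)\rangle
   = 2\pi \Big[ X(\vartheta)\, \langle\kappa(\vartheta)\rangle \Big]_0^{\theta_u}
   - 2\pi \int_0^{\theta_u} d\vartheta\; X(\vartheta)\, \frac{d\langle\kappa\rangle}{d\vartheta} , \qquad (73) $$

where $\theta_u$ is the radius of the aperture, and we have defined

$$ X(\theta) = \int_0^\theta d\vartheta\; \vartheta\, U(\vartheta) . $$

This definition and the compensated nature of $U$ imply that the boundary
terms in (73) vanish. Making use of (24), one finds that

$$ \frac{d\langle\kappa\rangle}{d\vartheta}
   = \frac{d\bar\kappa}{d\vartheta} - \frac{d\langle\gamma_{\rm t}\rangle}{d\vartheta}
   = -\frac{2}{\vartheta} \langle\gamma_{\rm t}\rangle - \frac{d\langle\gamma_{\rm t}\rangle}{d\vartheta} , $$

where we used (23) and (24) to obtain
$d\bar\kappa/d\vartheta = -2 \langle\gamma_{\rm t}\rangle / \vartheta$.
Inserting the foregoing equation into (73), one obtains

$$ M_{\rm ap} = 2\pi \int_0^{\theta_u} d\vartheta\; \frac{2 X(\vartheta)}{\vartheta}\, \langle\gamma_{\rm t}(\vartheta)\rangle
   + 2\pi \Big[ X(\vartheta)\, \langle\gamma_{\rm t}(\vartheta)\rangle \Big]_0^{\theta_u}
   - 2\pi \int_0^{\theta_u} d\vartheta\; \vartheta\, U(\vartheta)\, \langle\gamma_{\rm t}(\vartheta)\rangle . \qquad (74) $$

The boundary term again vanishes, and one sees that the last equation has
the form of (71), with the weight function $Q = 2X/\vartheta^2 - U$,
reproducing (72).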

We shall now consider a few properties of the aperture mass, which follow 
directly from (72). 

• If $U$ has finite support, then $Q$ has finite support, which is due to the
compensated nature of $U$. This implies that the aperture mass can be
calculated on a finite data field, i.e., from the shear in the same circle
where $U \neq 0$.

• If $U(\theta) = {\rm const.}$ for $0 \le \theta \le \theta_{\rm in}$, then
$Q(\theta) = 0$ for the same interval, as is seen directly from (72).
Therefore, the strong lensing regime (where $\gamma$ deviates appreciably
from $g$) can be avoided by properly choosing $U$ (and $Q$).

• If $U(\theta) = (\pi\theta_{\rm in}^2)^{-1}$ for
$0 \le \theta \le \theta_{\rm in}$,
$U(\theta) = -[\pi(\theta_{\rm out}^2 - \theta_{\rm in}^2)]^{-1}$ for
$\theta_{\rm in} < \theta \le \theta_{\rm out}$, and $U = 0$ for
$\theta > \theta_{\rm out}$, then
$Q(\theta) = \theta_{\rm out}^2\, \theta^{-2}\, [\pi(\theta_{\rm out}^2 - \theta_{\rm in}^2)]^{-1}$
for $\theta_{\rm in} \le \theta \le \theta_{\rm out}$, and $Q(\theta) = 0$
otherwise. For this special choice of $U$,

$$ M_{\rm ap} = \bar\kappa(\theta_{\rm in}) - \bar\kappa(\theta_{\rm in}, \theta_{\rm out}) , \qquad (75) $$

the mean mass density inside $\theta_{\rm in}$ minus the mean density in the
annulus $\theta_{\rm in} \le \vartheta \le \theta_{\rm out}$ (Kaiser 1995).
Since the latter is non-negative, this yields a lower limit to
$\bar\kappa(\theta_{\rm in})$, and thus to $M(\theta_{\rm in})$.

The aperture mass can be generalized to the case where the weight function
$U$ is constant on curves other than circles, e.g., on ellipses, in the sense
that the corresponding expressions can be rewritten directly in terms of the
shear on a finite region (see Squires & Kaiser 1996 for the case where $U$ is
constant on a set of self-similar curves, and Schneider & Bartelmann 1997
for a general set of nested curves). In general, $M_{\rm ap}$ is not a
particularly good measure for the total mass of a cluster - since it employs a
compensated filter - but it has been specifically designed that way to be
immune against the mass-sheet degeneracy. However, $M_{\rm ap}$ is a very
convenient measure for mass concentrations (see Sect. 5.8) and, as shown
above, yields a robust lower limit on cluster masses.
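
The equivalence of the two expressions for the aperture mass, and its
insensitivity to a uniform mass sheet, can be checked numerically for the
piecewise-constant filter discussed above. The sketch below assumes an
axisymmetric toy lens, a singular isothermal profile plus a constant sheet,
with $\kappa(\theta) = \kappa_0 + a/(2\theta)$ and $\gamma_{\rm t}(\theta) =
a/(2\theta)$; the function names, the toy lens and the simple midpoint
integration are all choices made here for illustration.

```python
import math


def u_filter(theta, t_in, t_out):
    """Compensated, piecewise-constant weight U (positive disk, negative annulus)."""
    if theta <= t_in:
        return 1.0 / (math.pi * t_in ** 2)
    if theta <= t_out:
        return -1.0 / (math.pi * (t_out ** 2 - t_in ** 2))
    return 0.0


def q_filter(theta, t_in, t_out):
    """Corresponding shear weight Q; non-zero only inside the annulus."""
    if t_in <= theta <= t_out:
        return t_out ** 2 / (theta ** 2 * math.pi * (t_out ** 2 - t_in ** 2))
    return 0.0


def radial_integral(f, a, b, n=20000):
    """Midpoint-rule estimate of int_a^b dtheta 2 pi theta f(theta)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += 2.0 * math.pi * t * f(t)
    return total * h


def map_from_kappa(kappa, t_in, t_out):
    """Aperture mass as the U-weighted integral of an axisymmetric kappa."""
    integrand = lambda t: u_filter(t, t_in, t_out) * kappa(t)
    # Split the range at t_in, where U is discontinuous.
    return radial_integral(integrand, 0.0, t_in) + radial_integral(integrand, t_in, t_out)


def map_from_shear(gamma_t, t_in, t_out):
    """Aperture mass as the Q-weighted integral of the tangential shear."""
    integrand = lambda t: q_filter(t, t_in, t_out) * gamma_t(t)
    return radial_integral(integrand, t_in, t_out)


if __name__ == "__main__":
    kappa0, a = 0.3, 0.5
    print(map_from_kappa(lambda t: kappa0 + a / (2.0 * t), 1.0, 3.0))
    print(map_from_shear(lambda t: a / (2.0 * t), 1.0, 3.0))
```

Both numbers agree with the difference of mean surface mass densities,
$a/\theta_{\rm in} - a/(\theta_{\rm in} + \theta_{\rm out})$ for this toy
lens, and the sheet $\kappa_0$ drops out.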

Aperture multipoles. The aperture method can also be used to calculate
multipoles of the mass distribution: define the multipoles

$$ Q^{(n)} := \int d^2\theta\; |\boldsymbol{\theta}|^n\, U(|\boldsymbol{\theta}|)\, e^{n i \varphi}\, \kappa(\boldsymbol{\theta}) , \qquad (76) $$

then the $Q^{(n)}$ can again be expressed as an integral over the shear. Here,
$U$ is a radial weight function for which certain restrictions apply (see
Schneider & Bartelmann 1997 for details), but which is not required to be
compensated for $n > 0$. A few cases of interest are: a weight function $U$
which is non-zero only within an annulus
$\theta_{\rm in} \le \theta \le \theta_{\rm out}$ and which continuously goes
to zero as $\theta \to \theta_{\rm in,out}$; in this case, the shear is
required only within the same annulus. Likewise, if $U$ is constant for
$0 \le \theta \le \theta_{\rm in}$ and then decreases smoothly to zero at
$\theta_{\rm out}$, only the shear within the annulus is required to calculate
the multipoles. Aperture multipoles can be used to calculate the multipole
moments of mass concentrations like clusters directly from the shear, i.e.,
without first obtaining a mass map, which allows a more direct quantification
of signal-to-noise properties.

The cross aperture. We have seen that the Kaiser & Squires inversion,
given by the first expression in (44), must yield a real result; the imaginary
part of the integral in (44) vanishes in the absence of noise. Suppose one
would multiply the complex shear by $i = e^{2i\pi/4}$; this would transform
the real part of the integral into the imaginary part and the imaginary part
into the negative of the real part. Geometrically, multiplication by this
phase factor corresponds to rotating the shear at every point by 45°. Hence,
if all shears are rotated by $\pi/4$, the real part of the Kaiser & Squires
inversion formula (44) yields zero. This 45-degree test has been suggested by
A. Stebbins; it can be used on real data to test whether typical features in
the mass map are significant, as those should have larger amplitude than
spurious features obtained from the mass reconstruction in which the shear has
been rotated by $\pi/4$ (the corresponding 'mass map' then yields a good
indication of the
typical noise present in the real mass map).

One can define in analogy to (71) the cross aperture by replacing the
tangential component of the shear by its cross component. According to the
45-degree test, the resulting cross aperture should be exactly zero. Hence, if
we define for $\boldsymbol{\theta}_0 = 0$

$$ M := M_{\rm ap} + i M_\perp = \int d^2\theta\; Q(|\boldsymbol{\theta}|)\, \left[ \gamma_{\rm t}(\boldsymbol{\theta}) + i \gamma_\times(\boldsymbol{\theta}) \right]
   = - \int d^2\theta\; Q(|\boldsymbol{\theta}|)\, \gamma(\boldsymbol{\theta})\, e^{-2 i \varphi} , \qquad (77) $$

where $\varphi$ is the polar angle of $\boldsymbol{\theta}$ as in (17), then
$M$ is expected to be purely real. We shall make use of this definition and
the interpretation of $M$ in later sections.

5.8 Mass detection of clusters

Motivation. If a weak lensing mass reconstruction of a cluster has been
performed and a mass peak is seen, it can also be quantified by applying the
aperture mass statistics to it: placing the center of the aperture on the mass
peak, and choosing the radius of the aperture to match the extent of the mass
peak, will give a significant positive value of $M_{\rm ap}$. Now consider
observing a random field in the sky, and determining the shear in this field.
Then, one can place apertures on this field and determine $M_{\rm ap}$ at each
point. If $M_{\rm ap}$ attains a significant positive value at some point, it
then corresponds to a point around which the shear is tangentially oriented.
Such shear patterns are generated by mass peaks according to (70) - hence, a
significant peak in the $M_{\rm ap}$-map corresponds to a mass concentration
(which can, in principle at least, be a mass concentration just in
two-dimensional projection, not necessarily in 3D). Hence, the aperture mass
statistics allows us to search for mass concentrations on blank fields, using
weak lensing methods (Schneider 1996). From the estimate (19), we see that the
detectable mass concentrations have to have typical cluster masses.

The reason why this method is interesting is obvious: As discussed in Sect.
6 of IN, the abundance of clusters as a function of mass and redshift is an
important cosmological probe. Cosmological simulations are able to predict
the abundance of massive halos for a given choice of cosmological parameters.
To compare these predictions with observations, cluster samples are analyzed.
However, clusters are usually detected either as an overdensity in the galaxy
number counts (possibly in connection with color information, to employ the
red cluster sequence - see Gladders & Yee 2000), or from extended X-ray
sources. In both cases, one makes use of the luminous properties of the
clusters, and cosmologists find it much more difficult to predict those, as
the physics of the baryonic component of the matter is much harder to handle
than the dark matter. Hence, a method for cluster detection that is
independent of their luminosity would provide a clean probe of cosmology. From
what was said above, the aperture mass provides such a method (Schneider
1996).

To illustrate this point, we show in Fig. 26 the projected mass and the 
corresponding shear field as it results from studying the propagation of light 
rays through a numerically generated cosmological matter distribution (Jain 
et al. 2000; we shall return to such simulations in Sect. 6.6). From the com- 
parison of these two panels, one sees that for each large mass concentration 
there is a tangential shear pattern centered on the mass peak. Thus, a sys- 
tematic search for such shear patterns can reveal the presence and abundance 
of peaks in the mass map. 



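The method. The search for mass concentrations can thus be carried out
by calculating the aperture mass on a grid over the data field and identifying
significant peaks. A practical estimator for $M_{\rm ap}$ is obtained by
replacing the integral in (71) by a finite sum over image ellipticities:

$$
\hat M_{\rm ap}(\boldsymbol{\theta}_0) = \frac{1}{n} \sum_i
  \epsilon_{{\rm t}i}(\boldsymbol{\theta}_0) \,
  Q(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_0|) \;,
\tag{78}
$$

where $n$ is the mean number density of galaxy images, and
$\epsilon_{{\rm t}i}(\boldsymbol{\theta}_0)$ is the ellipticity component of a
galaxy at $\boldsymbol{\theta}_i$ tangent to the center $\boldsymbol{\theta}_0$
of the aperture. This estimator has easy-to-quantify signal-to-noise
properties. In the absence of a lensing signal,
$\langle \hat M_{\rm ap} \rangle = 0$, and the dispersion of
$\hat M_{\rm ap}(\boldsymbol{\theta}_0)$ is

$$
\sigma^2(\boldsymbol{\theta}_0) = \frac{\sigma_\epsilon^2}{2 n^2} \sum_i
  Q^2(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_0|) \;;
\tag{79}
$$

hence, the signal-to-noise of $\hat M_{\rm ap}(\boldsymbol{\theta}_0)$ is

$$
\frac{S}{N}(\boldsymbol{\theta}_0) = \frac{\sqrt{2}}{\sigma_\epsilon} \,
  \frac{\sum_i \epsilon_{{\rm t}i}(\boldsymbol{\theta}_0) \,
        Q(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_0|)}
       {\sqrt{\sum_i Q^2(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_0|)}} \;.
\tag{80}
$$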
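As an illustration of how the discrete sums (78) and (80) might be evaluated for a single aperture, the following minimal sketch may be useful. It rests on several assumptions that are not taken from the text: the function names `q_filter` and `aperture_mass` are hypothetical, ellipticities are stored as complex numbers $\epsilon_1 + i\epsilon_2$, the tangential component is taken as $-{\rm Re}[\epsilon\, e^{-2i\phi}]$, and the polynomial weight in `q_filter` is merely one simple compact choice used for illustration, not necessarily the filter adopted in this section.

```python
import numpy as np


def q_filter(x, radius):
    """Assumed polynomial weight Q, with x = |theta_i - theta_0| / radius.

    This particular form is an illustrative choice; it vanishes for x >= 1.
    """
    x = np.asarray(x, dtype=float)
    q = 6.0 / (np.pi * radius**2) * x**2 * (1.0 - x**2)
    return np.where(x < 1.0, q, 0.0)


def aperture_mass(theta0, pos, eps, radius, n_density, sigma_eps):
    """Return (M_ap, S/N) from the discrete sums (78) and (80).

    theta0    : (2,) aperture centre
    pos       : (N, 2) galaxy positions theta_i
    eps       : (N,) complex ellipticities eps_1 + i * eps_2
    radius    : aperture radius (same units as the positions)
    n_density : mean number density of galaxy images
    sigma_eps : dispersion of the complex ellipticity
    """
    d = np.asarray(pos, dtype=float) - np.asarray(theta0, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    phi = np.arctan2(d[:, 1], d[:, 0])
    # Tangential ellipticity component relative to the aperture centre.
    eps_t = -np.real(np.asarray(eps) * np.exp(-2j * phi))
    q = q_filter(r / radius, radius)
    weighted_sum = np.sum(eps_t * q)
    m_ap = weighted_sum / n_density
    norm = sigma_eps * np.sqrt(np.sum(q**2))
    snr = np.sqrt(2.0) * weighted_sum / norm if norm > 0 else 0.0
    return m_ap, snr
```

Calling `aperture_mass` at every node of a grid of centres gives an S/N map in which peaks can then be searched for. Multiplying all ellipticities by `1j` (a rotation of the images by 45 degrees) turns a purely tangential pattern into a cross pattern, so the same function can also be used for the 45-degree test.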



The noise depends on $\boldsymbol{\theta}_0$, as the image number density can
vary over the data field.
The size (or radius) of the aperture should be adapted to the mass
concentrations expected: too small aperture radii miss most of the lensing
signal of real mass concentrations and are more susceptible to noise peaks,
whereas too large aperture radii include regions of very low signal which may
be swamped again by noise. In addition, the shape of the filter function $Q$
can be adapted to the expected mass profiles of mass concentrations; e.g., one
can design filters which are particularly sensitive to NFW-like density
profiles. In order not to prejudice the findings of a survey, it may be
advantageous to use a 'generic' filter function, e.g., of the form

Fig. 26. Projected mass distribution of the large-scale structure (left), and the
corresponding shear field (right), where the length and orientation of the sticks
indicate the magnitude and direction of the local shear. The top panels correspond
to an Einstein-de Sitter model of the Universe, whereas the bottom panels are for
a low density open model. The size of the field is one degree on the side, and the
background galaxies are assumed to all lie at the redshift $z_{\rm s} = 1$. Note
that each mass concentration seen in the left-hand panels generates a circular
shear pattern at this position; this forms the basic picture of the detection of
mass concentrations from a weak lensing observation (from Jain et al. 2000)

The relation between the two expressions for $M_{\rm ap}$ given by (70) and (71)
is only valid if the aperture lies fully inside the data field. If it does not,
i.e., if the aperture crosses the boundary of the data field, these two
expressions are no longer equivalent; nevertheless, the estimator (78) still
measures a tangential shear alignment around the aperture center and thus
signifies the
presence of a mass concentration.

There are superior estimates of the significance of a detected mass peak
than the signal-to-noise ratio (80). One consists in bootstrapping: one
calculates $M_{\rm ap}$ at a given point (where $N$ galaxies are in the
aperture) many times by randomly drawing - with replacement - $N$ galaxies, and
tests how often the signal is negative. The fraction of cases with negative
values corresponds to the error level of having a positive detection of
$M_{\rm ap}$. Alternatively, one can conduct another Monte-Carlo experiment, by
randomizing all galaxy image orientations and calculating $M_{\rm ap}$ from
these randomized samples, and ask in which fraction of realizations the value
of $M_{\rm ap}$ is larger than the measured value. As the randomized galaxies
should show no lensing signal, this fraction is again the probability of
getting a value as large as that measured from random galaxy orientations. In
fact, from the central limit theorem one expects that the probability
distribution of $M_{\rm ap}$ from randomizing the image orientations will be a
Gaussian of zero mean, and its dispersion can be calculated directly from (78)
to be

$$
\sigma^2(\boldsymbol{\theta}_0) = \frac{1}{2 n^2} \sum_i |\epsilon_i|^2 \,
  Q^2(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_0|) \;,
\tag{82}
$$

which is similar to (79), but accounts for the moduli of the ellipticity of the
individual galaxy images.
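The two resampling tests just described can be sketched for a single aperture as follows. This is only an illustration under stated assumptions: the names `q_filter` and `significance` are hypothetical, the polynomial weight is an assumed stand-in for the filter function, ellipticities are complex numbers, and a random orientation is modelled by multiplying each ellipticity by a random phase $e^{2i\alpha}$ with $\alpha$ uniform in $[0, \pi)$.

```python
import numpy as np


def q_filter(r, radius):
    """Assumed polynomial weight Q(|theta|); zero outside the aperture radius."""
    x = np.asarray(r, dtype=float) / radius
    return np.where(x < 1.0, 6.0 / (np.pi * radius**2) * x**2 * (1.0 - x**2), 0.0)


def significance(d, eps, radius, n_density, n_trials=2000, seed=0):
    """Bootstrap and randomization tests for one aperture.

    d   : (N, 2) galaxy positions relative to the aperture centre
    eps : (N,) complex ellipticities
    Returns (measured M_ap, fraction of negative bootstrap values,
             fraction of randomized values >= measured,
             analytic dispersion for random orientations,
             empirical dispersion of the randomized samples).
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    eps = np.asarray(eps, dtype=complex)
    phase = np.exp(-2j * np.arctan2(d[:, 1], d[:, 0]))
    q = q_filter(np.hypot(d[:, 0], d[:, 1]), radius)
    n_gal = len(eps)

    def m_ap(e, weights, ph):
        # Discrete aperture mass: sum of tangential ellipticity times Q, over n.
        return np.sum(-np.real(e * ph) * weights) / n_density

    measured = m_ap(eps, q, phase)

    # Bootstrap: redraw N galaxies with replacement and count negative outcomes.
    idx = rng.integers(0, n_gal, size=(n_trials, n_gal))
    boot = np.array([m_ap(eps[i], q[i], phase[i]) for i in idx])
    frac_negative = float(np.mean(boot < 0.0))

    # Randomization: keep |eps_i| and positions, draw random orientations.
    rot = np.exp(2j * rng.uniform(0.0, np.pi, size=(n_trials, n_gal)))
    rand = np.array([m_ap(eps * r, q, phase) for r in rot])
    frac_larger = float(np.mean(rand >= measured))

    # Expected Gaussian width: each randomized term has variance |eps_i|^2 Q_i^2 / 2.
    sigma = np.sqrt(np.sum(np.abs(eps) ** 2 * q**2) / 2.0) / n_density
    return measured, frac_negative, frac_larger, sigma, float(rand.std())
```

Comparing the last two return values gives a quick check that the randomized samples indeed scatter with the width expected from the moduli of the individual ellipticities.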

Both of the aforementioned methods take the true ellipticity distribution 
of galaxy images into account, and should yield very similar results for the 
significance. Highly significant peaks signify the presence of a mass concen- 
tration, detected solely on the basis of its mass, and therefore, it is a very 
promising search method for clusters. 

There is nothing special about the weight function (81), except mathematical
simplicity. It is therefore not clear whether these filter functions are
most efficient to detect cluster-mass matter concentrations. In fact, as shown
in Schneider (1996), the largest S/N is obtained if the filter function $U$
follows the true mass profile of the lens or, equivalently, if $Q$ follows its
radial shear profile. Hennawi & Spergel (2003) and Schirmer (2004) tested a
large range of filter functions, including (81), Gaussians, and those
approximating an NFW profile. Based on numerical ray-tracing simulations,
Hennawi & Spergel conclude that the 'truncated' NFW filter is most efficient
for cluster detections; the same conclusion has been reached by Schirmer (2004)
based on wide-field imaging data.

Furthermore, Hennawi & Spergel have complemented their cluster search
by a 'tomographic' component, assuming that the source galaxies have
(photometric) redshift estimates available. Since the lens strength is a
function of source redshift, the expected behaviour of the aperture mass signal
as a function of estimated source redshift can be used as an additional search
criterion. They showed that this additional information increases the
sensitivity of weak lensing to find mass concentrations, in particular for
higher-redshift ones; in fact, the cluster search by Wittman et al. (described
below) made use of redshift information. As an additional bonus, this method
also provides an estimate of the lens redshift.

Results. In the past few years, a number of clusters and/or cluster
candidates have been detected by the weak lensing method, and a few of them
shall be discussed here. The right-hand panel of Fig. 27 shows the mass
reconstruction of one of the 50 FORS1@VLT fields observed in the course of
a cosmic shear survey (see Sect. 7.1). This reconstruction shows an obvious
mass peak, indicated by a circle. The left panel shows the optical image, and
it is obvious that the location of the mass peak coincides with a concentration
of bright galaxies - this certainly is a cluster, detected by its weak lensing
signal. However, no follow-up observations have been conducted yet to measure
its redshift.

Fig. 27. A cosmic shear survey was carried out with the FORS1 instrument on
the VLT (see Maoli et al. 2001 and Sect. 7.1 below). The left panel shows one of
the 50 fields observed in the course of this survey, whereas the right panel shows
a weak-lensing mass reconstruction of this field. Obviously, a strong mass peak is
detected in this reconstruction, indicated by the circle. At the same position, one
finds a strong overdensity of relatively bright galaxies on the VLT image; therefore,
this mass peak corresponds to a cluster of galaxies. A reanalysis of all 50 VLT fields
(Hetterscheidt 2003) yielded no further significant cluster candidate; however, with
a field size of only $\sim 6'.5$, detecting clusters in them is difficult unless these
are positioned close to the field centers

Wittman et al. (2001, 2003) reported on the discovery of two clusters
from their wide-field weak lensing survey; one of them is shown in Fig. 28
and discussed here. First, a peak in their mass reconstruction was identified
which has a significance of $4.5\sigma$. The location of the mass peak is
identified with a concentration of red elliptical galaxies, with the two
centers separated by about $1'$ (which is about the accuracy with which the
centers of mass concentrations are expected to be determined from mass
reconstructions). Follow-up spectroscopy confirmed the galaxy concentration to
be a cluster at redshift $z_{\rm d} = 0.28$, with a velocity dispersion of
$\sigma_v \sim 600\,{\rm km/s}$. Since multi-color photometry data are
available, photometric redshift estimates of the faint galaxy population have
been obtained, and the tangential shear around the mass peak has been
investigated as a function of this estimated redshift. The lens signal rises as
the redshift increases, as expected due to the lensing efficiency factor
$D_{\rm ds}/D_{\rm s}$. In fact, from the source redshift dependence of the
lens signal, the lens redshift can be estimated, and yields a result within
$\sim 0.03$ of the spectroscopically measured value. Hence, in this case not
only can the presence of a cluster be inferred from weak lensing, but at the
same time a cluster redshift has been obtained from lensing observations alone.
This is one example of using source redshift information to investigate the
redshift structure of the lensing matter distribution; we shall return to a
more general discussion of this issue in Sect. 7.6.
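To make the source-redshift dependence of the lensing efficiency factor $D_{\rm ds}/D_{\rm s}$ concrete, here is a small numerical sketch. It assumes a spatially flat model with matter density parameter 0.3 and a cosmological constant; this cosmology, the function names, and the integration grid are illustrative assumptions and are not taken from the analysis of Wittman et al.

```python
import numpy as np


def comoving_distance(z, omega_m=0.3, n_steps=2048):
    """Comoving distance in units of c/H0 for an assumed flat Lambda model."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    # Trapezoidal integration of dz / E(z).
    return float(np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz)))


def lensing_efficiency(z_lens, z_source, omega_m=0.3):
    """D_ds / D_s for a flat model; zero for sources in front of the lens."""
    if z_source <= z_lens:
        return 0.0
    w_d = comoving_distance(z_lens, omega_m)
    w_s = comoving_distance(z_source, omega_m)
    # In a flat model D_ds / D_s = (w_s - w_d) / w_s; the (1 + z_s) factors cancel.
    return (w_s - w_d) / w_s
```

Evaluating `lensing_efficiency(0.28, z_s)` for a sequence of increasing source redshifts shows the monotonic rise of the expected signal behind the lens, which is the behaviour exploited to estimate the lens redshift from the shear alone.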




Fig. 28. Left: BTC image of a blank field, right: mass reconstruction, showing the 
presence of a (mass-selected) cluster near the lower right corner - spectroscopically 
verified to be at z = 0.276 (from Wittman et al. 2001) 



In a wide-field imaging weak lensing survey of galaxy clusters, Dahle et
al. (2003) detected three significant mass peaks away from the clusters that
were targeted. One of these cases is illustrated in Fig. 29, showing the mass
reconstruction in the field of the cluster A1705. The mass peak South-West
of the cluster coincides with a galaxy concentration at $z \sim 0.55$, as
estimated from their color, and an arc is seen near the brightest galaxy of
this cluster. A further cluster was detected in the wide-field image of the
A222/223 double cluster field (Dietrich et al. 2004) which coincides with an
overdensity of galaxies. Hence, by now of order ten cluster-mass matter
concentrations have been discovered by weak lensing techniques and verified as
genuine clusters from optical photometry and, for some of them, spectroscopy.

Fig. 29. Shown is the mass reconstruction of the field containing the cluster
A1705, located near the center of this field. The peak $\sim 4'$ to the
North-East of A1705 appears to be associated with galaxies at the same redshift
as A1705. However, the peak $\sim 4'$ South-West of A1705 seems to be associated
with galaxies at considerably larger redshift, at $z \sim 0.55 \pm 0.05$, as
determined from the $V - I$ colors of the corresponding galaxy concentration.
Indeed, an arc curving around the central galaxy of this newly detected cluster
candidate is observed (from Dahle et al. 2003)

Miyazaki et al. (2002) used a 2.1 deg$^2$ deep image taken with the
Suprime-Cam wide-field imager on Subaru to search for mass peaks. They compared
their peak statistics with both the expected peak statistics from a noise field
created by intrinsic galaxy ellipticities (Jain & van Waerbeke 2000) and those
from N-body simulations, and found a broader distribution in the actual
data. They interpret this as statistical evidence for the presence of mass
peaks; however, their interpretation of the significant dips in the mass map as
evidence for voids cannot hold, as the density contrast of voids is too small
(since the fractional density contrast $\delta > -1$) to be detectable with weak
lensing. They find a number density of $> 5\sigma$ peaks of about
5 deg$^{-2}$, well in agreement with predictions from Kruse & Schneider (1999)
and Reblinsky et al. (1999). Schirmer (2004) investigated about 16 deg$^2$ of
images taken with the WFI@ESO/MPG 2.2m, and detected 100 $> 4\sigma$ peaks,
again in good agreement with theoretical expectations.

Dark clusters? In addition, however, this method has the potential to
discover mass concentrations with very large mass-to-light ratio, i.e.,
clusters which are very faint optically and which would be missed in more
conventional surveys for clusters. Two potential 'dark clusters' have been
reported in the literature (footnote 7). Umetsu & Futamase (2000), using the
WFPC2 onboard HST, detected a highly significant ($4.5\sigma$) mass
concentration $1'.7$ away from the cluster Cl 1604+4304, also without an
apparent overdensity of associated galaxies.

7 A third case reported in Miralles et al. (2002) has in the meantime been
considerably weakened (Erben et al. 2003).

In the course of a wide-field weak lensing analysis of the cluster A1942,
Erben et al. (2000) detected a mass peak which, using the aperture mass
statistics introduced previously, has been shown to be highly significant
($\sim 4.7\sigma$ on the V-band image), with the significance being obtained
from the randomization and bootstrapping techniques described above. An
additional I-band image confirmed the presence of a mass peak at the same
location as on the V-band image, though with somewhat lower significance. No
concentration of galaxies is seen near the location of the mass peak, which
indicates that it either is a very dark mass concentration, or a cluster at a
fairly high redshift (which, however, would imply an enormous mass for it), or,
after all, a statistical fluke. It is important to note that the signal in
$M_{\rm ap}$ comes from a range of radii (see Fig. 30); it is not dominated by
a few highly flattened galaxies which happen to have a fortuitous orientation.
Gray et al. (2000) have used near-IR images to search for a galaxy
concentration in this direction, without finding an obvious candidate.
Therefore, at present it is unclear whether the 'dark clump' is indeed a very
unusual cluster. A low-significance X-ray source near its position, as obtained
in a ROSAT observation of A1942, certainly needs confirmation by the more
sensitive X-ray observatory XMM (footnote 8). Of course, if there are really
dark clusters, their confirmation by methods other than weak lensing would be
extremely difficult; but even if we are dealing with a statistical fluke, it
would be very important to find the cause for it. An HST mosaic observation of
this field has been conducted; a first analysis of these data was able to
confirm the findings of Erben et al., in the sense that the shear signal from
galaxies seen in both the HST images and the ground-based data has a
significant tangential alignment (von der Linden 2004). However, contrary to
expectations if this was truly a lensing mass signal, there is hardly any
tangential alignment from fainter galaxies, although they are expected to be
located at higher redshift and thus should show a stronger shear signal.
However, as a word of caution, the PSF anisotropy of WFPC2 cannot be controlled
from stars on the image, owing to the small field-of-view, and no stellar
cluster has been observed with the filter with which the dark clump
observations were conducted, so that the PSF anisotropy cannot be accurately
inferred from such calibration images. The existence of dark clusters would be
highly unexpected in view of our current understanding of structure formation
and galaxy evolution, and would require revisions of these models.

The search for clusters by weak lensing will certainly continue, due to
the novel properties of the cluster samples obtained that way. The
observational data required are the same as those used for cosmic shear
studies, and several very wide-field surveys are currently being conducted, as
will be described in Sect. 7. Hence, we can expect to have a sizable sample of
shear-selected clusters in the near future. The search for mass concentrations
by weak lensing techniques is affected by foreground and background
inhomogeneities, which impose fundamental limits on the reliability and
completeness of such searches; we shall return to this issue in Sect. 9.2.

8 Judging from the results of several proposal submissions, people on X-ray TACs
seem not to care too much about dark cluster candidates.

Fig. 30. Tangential shear profile from both (V- and I-band) images around the
'dark cluster' candidate near the cluster A1942. For each angular scale, two
points (and corresponding error bars) are plotted, which are derived from two
different images of the field in the V- and I-band. It can be seen that the
tangential shear signal extends over quite a range in radius (from Erben et al.
2000)

Expectations. Kruse & Schneider (1999) have calculated the expected number
density of lensing-detected clusters, using the aperture-mass method, for
different cosmological parameters; these have been verified in numerical
simulations of the large-scale structure by Reblinsky et al. (1999). Depending
on the cosmological model, a few clusters per deg$^2$ should be detected at
about the $5\sigma$ level. The dependence of the expected number density of
detectable mass peaks on the cosmological parameters can be used as a
cosmological probe; in particular, Bartelmann et al. (2002) and Weinberg &
Kamionkowski (2003) demonstrate that the observed abundance of weak lensing
clusters can probe the equation-of-state of the dark energy. Bartelmann et al.
(2001) argued that the abundance of weak lensing detected clusters strongly
depends on their mass profile, with an order-of-magnitude difference between
NFW profiles and isothermal spheres. Weinberg & Kamionkowski (2002) argued,
based on the spherical collapse model of cluster formation, that a considerable
fraction of such detections are expected to be due to non-virialized mass
concentrations, which would then be considerably weaker X-ray emitters and
may be candidates for the 'dark clusters'.

6 Cosmic shear - lensing by the LSS

Up to now we have considered the lensing effect of localized mass
concentrations, like galaxies and clusters. In addition to that, light bundles
propagating through the Universe are continuously deflected and distorted by
the gravitational field of the inhomogeneous mass distribution, the large-scale
structure (LSS) of the cosmic matter field. This distortion of light bundles
causes shape and size distortions of images of distant galaxies, and therefore,
the statistics of the distortions reflect the statistical properties of the LSS
(Gunn 1967; Blandford et al. 1991; Miralda-Escudé 1991; Kaiser 1992).

Cosmic shear deals with the investigation of this connection, from the 
measurement of the correlated image distortions to the inference of cosmo- 
logical information from this distortion statistics. As we shall see, cosmic 
shear has become a very important tool in observational cosmology. From a 
technical point-of-view, it is quite challenging, first because the distortions 
are indeed very weak and therefore difficult to measure, and second, in con- 
trast to 'ordinary' lensing, here the light deflection does not occur in a 'lens 
plane' but by a 3-D matter distribution, implying the need for a different de- 
scription of the lensing optics. We start by looking at the description of light 
propagating through the Universe, and then consider the second-order statis- 
tical properties of the cosmic shear which reflect the second-order statistical 
properties of the cosmic matter field, i.e., the power spectrum. Observational 
results from cosmic shear surveys are presented in Sect. 7, whereas higher- 
order statistical properties of the shear field will be treated in Sect. 9. 

6.1 Light propagation in an inhomogeneous Universe 

In this brief, but rather technical section, we outline the derivation of the
lensing effects of the three-dimensional mass distribution between the faint
background galaxy population and us; the reader is referred to Bartelmann
& Schneider (2001) for a more detailed discussion. The final result of this
consideration has a very simple interpretation: in the lowest-order
approximation, the 3-D cosmological mass distribution can be considered, for
sources at a single redshift $z_{\rm s}$, as an effective surface mass density
$\kappa$, just like in ordinary lensing. The resulting $\kappa$ is obtained as
a line-of-sight integral of the density contrast $\Delta\rho$, weighted by the
usual geometrical factor entering the lens equations.

The laws of light propagation follow from Einstein's General Relativity;
according to it, light propagates along the null-geodesics of the space-time
metric. As shown in SEF (see also Seitz et al. 1994), one can derive from
General Relativity that the governing equation for the propagation of thin
light bundles through an arbitrary space-time is the equation of geodesic
deviation,

$$
\frac{d^2 \boldsymbol{\xi}}{d\lambda^2} = \mathcal{T} \, \boldsymbol{\xi} \;,
\tag{83}
$$

where $\boldsymbol{\xi}$ is the separation vector of two neighboring light rays,
$\lambda$ the affine parameter along the central ray of the bundle, and
$\mathcal{T}$ is the optical tidal matrix which describes the influence of
space-time curvature on the propagation of light. $\mathcal{T}$ can be expressed
directly in terms of the Riemann curvature tensor.

Fig. 31. Illustration of the evolution of the separation between two light rays in a
curved space-time (source: T. Schrabback)

For the case of a weakly inhomogeneous Universe, the tidal matrix can be
explicitly calculated in terms of the peculiar Newtonian potential. For that,
we write the slightly perturbed metric of the Universe in the form

$$
ds^2 = a^2(\tau) \left[ \left( 1 + \frac{2\Phi}{c^2} \right) c^2 \, d\tau^2
  - \left( 1 - \frac{2\Phi}{c^2} \right)
    \left( dw^2 + f_K^2(w) \, d\omega^2 \right) \right] \;,
\tag{84}
$$

where $w$ is the comoving radial distance, $a = (1 + z)^{-1}$ the scale factor,
normalized to unity today, $\tau$ is the conformal time, related to the cosmic
time $t$ through $dt = a \, d\tau$, $f_K(w)$ is the comoving angular diameter
distance, which equals $w$ in a spatially flat model, and
$\Phi(\boldsymbol{x}, w)$ denotes the Newtonian peculiar gravitational
potential which depends on the comoving position vector $\boldsymbol{x}$ and
cosmic time, here expressed in terms of the comoving distance $w$ (see Sect. 4
of IN for a more detailed description of the various cosmological terms). In
this metric, the tidal matrix $\mathcal{T}$ can be calculated in terms of the
Newtonian potential $\Phi$, and correspondingly, the equation of geodesic
deviation (83) yields the evolution equation for the comoving separation vector
$\boldsymbol{x}(\boldsymbol{\theta}, w)$ between a ray separated by an angle
$\boldsymbol{\theta}$ at the observer from a fiducial ray

$$
\frac{d^2 \boldsymbol{x}}{dw^2} + K \, \boldsymbol{x}
= - \frac{2}{c^2} \left[
    \nabla_\perp \Phi\left( \boldsymbol{x}(\boldsymbol{\theta}, w), w \right)
  - \nabla_\perp \Phi^{(0)}(w) \right] \;,
\tag{85}
$$

where $K = (H_0/c)^2 (\Omega_{\rm m} + \Omega_\Lambda - 1)$ is the spatial
curvature of the Universe,
$\nabla_\perp = (\partial/\partial x_1, \partial/\partial x_2)$ is the
transverse comoving gradient operator, and $\Phi^{(0)}(w)$ is the potential
along the fiducial ray (footnote 9). The formal solution of this transport
equation is obtained by the method of Green's function, to yield

$$
\boldsymbol{x}(\boldsymbol{\theta}, w) = f_K(w) \, \boldsymbol{\theta}
- \frac{2}{c^2} \int_0^w dw' \; f_K(w - w') \left[
    \nabla_\perp \Phi\left( \boldsymbol{x}(\boldsymbol{\theta}, w'), w' \right)
  - \nabla_\perp \Phi^{(0)}(w') \right] \;.
\tag{86}
$$

9 In some of the literature, this transport equation is written without the term
accounting for the potential along the fiducial ray. The idea behind this is to
compare a light ray in the inhomogeneous universe with one in the homogeneous,
unperturbed universe. Apart from the conceptual difficulty, this 'first-order
expansion' is not justified, as the light rays in an inhomogeneous universe can
deviate quite significantly from straight rays in the homogeneous reference
universe - much more than the length scale of typical density fluctuations.
These difficulties are all avoided if one starts from the exact equation of
geodesic deviation, as done here.

A source at comoving distance $w$ with comoving separation $\boldsymbol{x}$ from
the fiducial light ray would be seen, in the absence of lensing, at the angular
separation $\boldsymbol{\beta} = \boldsymbol{x}/f_K(w)$ from the fiducial ray
(this statement is nothing but the definition of the comoving angular diameter
distance). Hence, $\boldsymbol{\beta}$ is the unlensed angular position in the
'comoving source plane' at distance $w$, where the origin of this source plane
is given by the intersection point with the fiducial ray. Therefore, in analogy
with standard lens theory, we define the Jacobian matrix

$$
\mathcal{A}(\boldsymbol{\theta}, w)
= \frac{\partial \boldsymbol{\beta}}{\partial \boldsymbol{\theta}}
= \frac{1}{f_K(w)} \frac{\partial \boldsymbol{x}}{\partial \boldsymbol{\theta}} \;,
\tag{87}
$$

and obtain from (86)

$$
\mathcal{A}_{ij}(\boldsymbol{\theta}, w) = \delta_{ij}
- \frac{2}{c^2} \int_0^w dw' \; \frac{f_K(w - w') \, f_K(w')}{f_K(w)} \;
  \Phi_{,ik}\left( \boldsymbol{x}(\boldsymbol{\theta}, w'), w' \right)
  \mathcal{A}_{kj}(\boldsymbol{\theta}, w') \;,
\tag{88}
$$

which describes the locally linearized mapping introduced by LSS lensing. To
derive (88), we noted that $\nabla_\perp \Phi^{(0)}$ does not depend on
$\boldsymbol{\theta}$, and used the chain rule in the derivative of $\Phi$.
This equation still is exact in the limit of validity of the weak-field metric.
Next, we expand $\mathcal{A}$ in powers of $\Phi$, and truncate the series
after the linear term:

$$
\mathcal{A}_{ij}(\boldsymbol{\theta}, w) = \delta_{ij}
- \frac{2}{c^2} \int_0^w dw' \; \frac{f_K(w - w') \, f_K(w')}{f_K(w)} \;
  \Phi_{,ij}\left( f_K(w') \boldsymbol{\theta}, w' \right) \;.
\tag{89}
$$

Hence, to linear order, the distortion can be obtained by integrating along
the unperturbed ray $\boldsymbol{x} = f_K(w) \, \boldsymbol{\theta}$; this is
also called the Born approximation. Corrections to the Born approximation are
necessarily of order $\Phi^2$. Throughout this article, we will employ the Born
approximation; later, we will comment on its accuracy. If we now define the
deflection potential

$$
\psi(\boldsymbol{\theta}, w) := \frac{2}{c^2} \int_0^w dw' \;
  \frac{f_K(w - w')}{f_K(w) \, f_K(w')} \;
  \Phi\left( f_K(w') \boldsymbol{\theta}, w' \right) \;,
\tag{90}
$$

then $\mathcal{A}_{ij} = \delta_{ij} - \psi_{,ij}$, just as in ordinary lens
theory. In this approximation, lensing by the 3-D matter distribution can be
treated as an equivalent lens plane with deflection potential $\psi$, mass
density $\kappa = \nabla^2 \psi / 2$, and shear
$\gamma = (\psi_{,11} - \psi_{,22})/2 + i \psi_{,12}$.
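The relations between the deflection potential and the convergence and shear (half the 2-D Laplacian of the potential for the convergence, and the combination of second derivatives given above for the complex shear) can be checked numerically on a grid. The following sketch is an illustration only: the function name `kappa_gamma_from_psi` and the use of centred finite differences on a regular grid are assumptions made here, not part of the derivation.

```python
import numpy as np


def kappa_gamma_from_psi(psi, step):
    """Finite-difference convergence and complex shear from a gridded potential.

    psi[i, j] is assumed to be sampled at theta_1 = i * step, theta_2 = j * step.
    Returns (kappa, gamma) with gamma = gamma_1 + i * gamma_2.
    """
    # First derivatives along theta_1 (axis 0) and theta_2 (axis 1).
    d1, d2 = np.gradient(psi, step, edge_order=2)
    # Second derivatives psi_11, psi_12 and psi_22.
    psi_11, psi_12 = np.gradient(d1, step, edge_order=2)
    _, psi_22 = np.gradient(d2, step, edge_order=2)
    kappa = 0.5 * (psi_11 + psi_22)
    gamma = 0.5 * (psi_11 - psi_22) + 1j * psi_12
    return kappa, gamma
```

For a purely quadratic potential the finite differences are exact, which makes such a potential a convenient test case: a term proportional to $\theta_1^2 + \theta_2^2$ contributes only to the convergence, whereas terms proportional to $\theta_1^2 - \theta_2^2$ and $\theta_1 \theta_2$ contribute only to the two shear components.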



6.2 Cosmic shear: the principle 

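The effective surface mass density. Next, we relate $\kappa$ to the fractional
density contrast $\delta$ of matter fluctuations in the Universe; this is done
in a number of steps:

1. To obtain $\kappa = \nabla^2 \psi / 2$, take the 2-D Laplacian of $\psi$, and
   add the term $\Phi_{,33}$ in the resulting integrand; this latter term
   vanishes in the line-of-sight integration, as can be seen by integration by
   parts.

2. We make use of the 3-D Poisson equation in comoving coordinates

$$
\nabla^2 \Phi = \frac{3 H_0^2 \Omega_{\rm m}}{2 a} \, \delta
\tag{91}
$$

   to obtain

$$
\kappa(\boldsymbol{\theta}, w) = \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2}
  \int_0^w dw' \; \frac{f_K(w') \, f_K(w - w')}{f_K(w)} \;
  \frac{\delta\left( f_K(w') \boldsymbol{\theta}, w' \right)}{a(w')} \;.
\tag{92}
$$

   Note that $\kappa$ is proportional to $\Omega_{\rm m}$, since lensing is
   sensitive to $\Delta\rho \propto \Omega_{\rm m} \, \delta$, not just to the
   density contrast $\delta = \Delta\rho / \bar\rho$ itself.

3. For a redshift distribution of sources with
   $p_z(z) \, dz = p_w(w) \, dw$, the effective surface mass density becomes

$$
\kappa(\boldsymbol{\theta}) = \int dw \; p_w(w) \, \kappa(\boldsymbol{\theta}, w)
= \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2} \int_0^{w_{\rm h}} dw \; g(w) \, f_K(w) \,
  \frac{\delta\left( f_K(w) \boldsymbol{\theta}, w \right)}{a(w)} \;,
\tag{93}
$$

   with

$$
g(w) = \int_w^{w_{\rm h}} dw' \; p_w(w') \, \frac{f_K(w' - w)}{f_K(w')} \;,
\tag{94}
$$

   which is the source-redshift weighted lens efficiency factor
   $D_{\rm ds}/D_{\rm s}$ for a density fluctuation at distance $w$, and
   $w_{\rm h}$ is the comoving horizon distance, obtained from $w(a)$ by
   letting $a \to 0$.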
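The source-weighted lens efficiency $g(w)$ defined in step 3 can be evaluated by direct numerical integration once a source distance distribution is given. The sketch below is a minimal illustration under assumptions made here: the name `lens_efficiency` is hypothetical, the integral is cut off at the end of the supplied grid instead of at the horizon distance, a simple trapezoidal rule is used, and the default distance function corresponds to a spatially flat model.

```python
import numpy as np


def lens_efficiency(w, p_w, f_k=lambda x: x):
    """Evaluate g(w) on the grid w by trapezoidal integration over w' >= w.

    w   : increasing grid of comoving distances; sources beyond w[-1] are ignored
    p_w : source distance distribution p_w(w) sampled on the same grid
    f_k : comoving angular diameter distance; the identity assumes a flat model
    """
    w = np.asarray(w, dtype=float)
    p_w = np.asarray(p_w, dtype=float)
    g = np.zeros_like(w)
    for i in range(len(w) - 1):
        wp = w[i:]
        denom = f_k(wp)
        # Guard against f_K(w') = 0 at the observer; the ratio is set to 0 there.
        safe = np.where(denom > 0.0, denom, 1.0)
        ratio = np.where(denom > 0.0, f_k(wp - w[i]) / safe, 0.0)
        integrand = p_w[i:] * ratio
        g[i] = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(wp))
    return g
```

For a normalized distribution that vanishes at the observer, the result starts at $g(0) = 1$ and decreases monotonically to zero at the largest source distance, reflecting that matter close to the sources contributes little to the lensing signal.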

The expression (92) for the effective surface mass density can be interpreted
in a very simple way. Consider a redshift interval of width $dz$ around $z$,
corresponding to the proper radial distance interval
$dD_{\rm prop} = |c \, dt| = H^{-1}(z) (1 + z)^{-1} c \, dz$. The surface mass
density in this interval is $\Delta\rho \, dD_{\rm prop}$, where only the
density contrast $\Delta\rho = \rho - \bar\rho$ acts as a lens (the 'lensing
effect' of the mean matter density of the Universe is accounted for by the
relations between angular diameter distance and redshift; see Schneider & Weiss
1988a). Dividing this surface mass density by the corresponding critical
surface mass density, and integrating along the line-of-sight to the sources,
one finds

$$
\kappa = \int_0^{z_{\rm s}} dz \; \frac{4 \pi G}{c^2} \,
  \frac{D_{\rm d} \, D_{\rm ds}}{D_{\rm s}} \,
  \frac{dD_{\rm prop}}{dz} \, \Delta\rho \;.
\tag{95}
$$

This expression is equivalent to (92), as can be easily shown (by the way,
this is a good exercise for practicing the use of cosmological quantities like
redshift, distances etc.).



Limber's equation. The density field $\delta$ is assumed to be a realization
of a random field. It is the properties of the random field that cosmologists
can hope to predict, and not a specific realization of it. In particular, the
second-order statistical properties of the density field are described in terms
of the power spectrum (see IN, Sect. 6.1). We shall therefore look at the
relation between the quantities relevant for lensing and the power spectrum
$P_\delta(k)$ of the matter distribution in the Universe. The basis of this
relation is formed by Limber's equation. If $\delta$ is a homogeneous and
isotropic 3-D random field, then the projections

$$
g_i(\boldsymbol{\theta}) = \int dw \; q_i(w) \,
  \delta\left( f_K(w) \boldsymbol{\theta}, w \right)
\tag{96}
$$

also are (2-D) homogeneous and isotropic random fields, where the $q_i$ are
weight functions. In particular, the correlation function

$$
C_{12} = \left\langle g_1(\boldsymbol{\varphi}_1) \,
  g_2(\boldsymbol{\varphi}_2) \right\rangle
= C_{12}(|\boldsymbol{\varphi}_1 - \boldsymbol{\varphi}_2|)
\tag{97}
$$

depends only on the modulus of the separation vector. The original form of
the Limber (1953) equation relates $C_{12}$ to the correlation function of
$\delta$ which is a line-of-sight projection. Alternatively, one can consider
the Fourier-space analogue of this relation: The power spectrum
$P_{12}(\ell)$ - the Fourier transform of $C_{12}(\theta)$ - depends linearly
on $P_\delta(k)$ (Kaiser 1992, 1998),

$$
P_{12}(\ell) = \int dw \; \frac{q_1(w) \, q_2(w)}{f_K^2(w)} \;
  P_\delta\left( \frac{\ell}{f_K(w)}, w \right) \;,
\tag{98}
$$

if the largest-scale structures in $\delta$ are much smaller than the effective
range $\Delta w$ of the projection. Hence, we obtain the (very reasonable)
result that the 2-D power at angular scale $1/\ell$ is obtained from the 3-D
power at length scale $f_K(w) \, (1/\ell)$, integrated over $w$.

Comparing (93) with (98), one sees that $\kappa(\boldsymbol{\theta})$ is such a
projection of $\delta$ with the weights
$q_1(w) = q_2(w) = (3/2) (H_0/c)^2 \, \Omega_{\rm m} \, g(w) \, f_K(w) / a(w)$,
so that

$$
P_\kappa(\ell) = \frac{9 H_0^4 \Omega_{\rm m}^2}{4 c^4}
  \int_0^{w_{\rm h}} dw \; \frac{g^2(w)}{a^2(w)} \;
  P_\delta\left( \frac{\ell}{f_K(w)}, w \right) \;.
\tag{99}
$$

The power spectrum $P_\kappa$, if observable, can therefore be used to constrain
the 3-D power spectrum $P_\delta$. For a number of cosmological models, the
power spectrum $P_\kappa(\ell)$ is plotted in Fig. 32. Predictions of
$P_\kappa$ are plotted both assuming linear growth of the density structure
(see Sect. 6.1 of IN) and using the prescription for the fully nonlinear power
spectrum given by the fitting formulae of Peacock & Dodds (1996). From this
figure one infers that the nonlinear evolution of the density fluctuations
becomes dominant for values of $\ell \gtrsim 200$, corresponding to an angular
scale of about $30'$; the precise values depend on the cosmological model and
the redshift distribution of the sources. Furthermore, the dimensionless power
spectrum $\ell^2 P_\kappa(\ell)$, that is, the power per logarithmic bin, peaks
at around $\ell \sim 10^4$, corresponding to an angular scale of $\sim 1'$,
again somewhat depending on the source redshift distribution. Third, one
notices that the shape and amplitude of $P_\kappa$ depend on the values of the
cosmological parameters; therefore, by measuring the power spectrum, or
quantities directly related to it, one can constrain the values of the
cosmological parameters. We consider next appropriate statistical measures of
the cosmic shear which are directly and simply related to the power spectrum
$P_\kappa$.




Fig. 32. The power spectrum $P_\kappa(\ell)$ (left panel) and its dimensionless form $\ell^2 P_\kappa(\ell)$
(right panel) for several cosmological models (where here, $\ell$ is denoted by $s$).
Specifically, EdS denotes an $\Omega_{\rm m} = 1$, $\Omega_\Lambda = 0$ Einstein-de Sitter model, OCDM an open
$\Omega_{\rm m} = 0.3$, $\Omega_\Lambda = 0$ Universe, and $\Lambda$CDM a flat, low-density $\Omega_{\rm m} = 0.3$, $\Omega_\Lambda = 0.7$
model. Numbers in parentheses indicate $(\Gamma_{\rm spect}, \sigma_8)$, where $\Gamma_{\rm spect}$ is the shape
parameter of the power spectrum (see IN, Sect. 6.1) and $\sigma_8$ is the power-spectrum
normalization. For these power spectra, the mean redshift of the galaxy distribution
was assumed to be $\langle z_{\rm s} \rangle = 1.5$. Thin curves show the power spectra assuming linear
evolution of the density fluctuations in the Universe, and thick curves use the fully
non-linear evolution, according to the prescription of Peacock & Dodds (1996). For
angular scales below $\sim 30'$, corresponding to $\ell > 200$, the non-linear evolution of
the power spectrum becomes very important (from Schneider et al. 1998a)

6.3 Second-order cosmic shear measures

We will now turn to statistical quantities of the cosmic shear field which are
quadratic in the shear, i.e., to second-order shear statistics. Higher-order
statistical properties, which already have been detected in cosmic shear surveys,
will be considered in Sect. 9. As we shall see, all second-order statistics of
the cosmic shear yield (filtered) information about, and are fully described
in terms of, $P_\kappa$. The most-often used second-order statistics are:

• The two-point correlation function(s) of the shear, $\xi_\pm(\theta)$,

• the shear dispersion in a (circular) aperture, $\langle |\bar\gamma|^2 \rangle(\theta)$, and

• the aperture mass dispersion, $\langle M_{\rm ap}^2 \rangle(\theta)$.

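Those will be discussed next, and their relation to $P_\kappa(\ell)$ shown. As a
preparation, consider the Fourier transform of $\kappa$,

$$ \hat\kappa(\boldsymbol{\ell}) = \int \mathrm{d}^2\theta\; e^{i\boldsymbol{\ell}\cdot\boldsymbol{\theta}}\, \kappa(\boldsymbol{\theta}) ; \tag{100} $$

then,

$$ \langle \hat\kappa(\boldsymbol{\ell})\, \hat\kappa^*(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_\kappa(\ell) , \tag{101} $$

which provides another definition of the power spectrum $P_\kappa$ [compare with
eq. (123) of IN]. The Fourier transform of the shear is

$$ \hat\gamma(\boldsymbol{\ell}) = \left( \frac{\ell_1^2 - \ell_2^2 + 2i\ell_1\ell_2}{|\boldsymbol{\ell}|^2} \right) \hat\kappa(\boldsymbol{\ell}) = e^{2i\beta}\, \hat\kappa(\boldsymbol{\ell}) , \tag{102} $$

where $\beta$ is the polar angle of the vector $\boldsymbol{\ell}$; this follows directly from (42) and
(43). Eq. (102) implies that

$$ \langle \hat\gamma(\boldsymbol{\ell})\, \hat\gamma^*(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_\kappa(\ell) . \tag{103} $$

Hence, the power spectrum of the shear is the same as that of the surface
mass density.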



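Shear correlation functions. Consider a pair of points (i.e., galaxy images);
their separation direction $\varphi$ (i.e. the polar angle of the separation vector
$\boldsymbol{\theta}$) is used to define the tangential and cross-component of the shear at these
positions for this pair, $\gamma_{\rm t} = -\mathrm{Re}\left(\gamma\, e^{-2i\varphi}\right)$, $\gamma_\times = -\mathrm{Im}\left(\gamma\, e^{-2i\varphi}\right)$, as in (17).
Using these two shear components, one can then define the correlation
functions $\langle \gamma_{\rm t}\gamma_{\rm t} \rangle(\theta)$ and $\langle \gamma_\times\gamma_\times \rangle(\theta)$, as well as the mixed correlator. However, it
turns out to be more convenient to define the following combinations,

$$ \xi_\pm(\theta) = \langle \gamma_{\rm t}\gamma_{\rm t} \rangle(\theta) \pm \langle \gamma_\times\gamma_\times \rangle(\theta) , \quad \xi_\times(\theta) = \langle \gamma_{\rm t}\gamma_\times \rangle(\theta) . \tag{104} $$

Due to parity symmetry, $\xi_\times(\theta)$ is expected to vanish, since under such a
transformation, $\gamma_{\rm t} \to \gamma_{\rm t}$, but $\gamma_\times \to -\gamma_\times$. Next we relate the shear correlation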






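functions to the power spectrum $P_\kappa$: Using the definition of $\xi_\pm$, replacing $\gamma$
in terms of $\hat\gamma$, and making use of the relation between $\hat\gamma$ and $\hat\kappa$, one finds (e.g.,
Kaiser 1992)

$$ \xi_+(\theta) = \int_0^\infty \frac{\mathrm{d}\ell\, \ell}{2\pi}\, J_0(\ell\theta)\, P_\kappa(\ell) ; \quad \xi_-(\theta) = \int_0^\infty \frac{\mathrm{d}\ell\, \ell}{2\pi}\, J_4(\ell\theta)\, P_\kappa(\ell) , \tag{105} $$

where $J_n(x)$ is the n-th order Bessel function of the first kind. $\xi_\pm$ can be measured
as follows: on a data field, select all pairs of faint galaxies with separation
within $\Delta\theta$ of $\theta$ and then take the average $\langle \epsilon_{{\rm t}i}\, \epsilon_{{\rm t}j} \rangle$ over all these pairs; since
$\epsilon = \epsilon^{(\rm s)} + \gamma(\boldsymbol{\theta})$, the expectation value of $\langle \epsilon_{{\rm t}i}\, \epsilon_{{\rm t}j} \rangle$ is $\langle \gamma_{\rm t}\gamma_{\rm t} \rangle(\theta)$, provided
source ellipticities are uncorrelated. Similarly, the correlation for the
cross-components is obtained. It is obvious how to generalize this estimator in the
presence of a weight factor for the individual galaxies, as it results from the
image analysis described in Sect. 3.5.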
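
The pair-counting estimator just described can be sketched as follows. This is an unweighted, brute-force illustration under simple assumptions: flat-sky positions, complex ellipticities $\epsilon = \epsilon_1 + i\epsilon_2$, and a function name `xi_pm` and toy catalogue that are purely hypothetical. A survey analysis would use weights and a tree or grid to find pairs.

```python
import cmath
import math


def xi_pm(positions, ellipticities, theta, dtheta):
    """Estimate xi_+ and xi_- from all pairs with separation within dtheta of theta."""
    sum_plus = 0.0
    sum_minus = 0.0
    n_pairs = 0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            if abs(math.hypot(dx, dy) - theta) > dtheta:
                continue
            phi = math.atan2(dy, dx)          # polar angle of the separation vector
            rot = cmath.exp(-2j * phi)
            ei = ellipticities[i] * rot
            ej = ellipticities[j] * rot
            et_i, ex_i = -ei.real, -ei.imag   # tangential and cross components
            et_j, ex_j = -ej.real, -ej.imag
            sum_plus += et_i * et_j + ex_i * ex_j
            sum_minus += et_i * et_j - ex_i * ex_j
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no galaxy pairs in this separation bin")
    return sum_plus / n_pairs, sum_minus / n_pairs, n_pairs


# Toy catalogue (hypothetical): four galaxies on a unit square, all with the
# same ellipticity c, so that xi_+ must equal |c|^2 in the bin around theta = 1.
c = 0.03 + 0.04j
toy_positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_xi_p, toy_xi_m, toy_pairs = xi_pm(toy_positions, [c] * 4, theta=1.0, dtheta=0.1)
print(toy_xi_p, toy_xi_m, toy_pairs)
```

Note that the sum $\epsilon_{{\rm t}i}\epsilon_{{\rm t}j} + \epsilon_{\times i}\epsilon_{\times j}$ equals $\mathrm{Re}(\epsilon_i \epsilon_j^*)$ and thus does not depend on $\varphi$, whereas the difference carries a factor $e^{-4i\varphi}$.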

The shear dispersion. Consider a circular aperture of radius $\theta$; the mean
shear in this aperture is $\bar\gamma$. Averaging over many such apertures, one defines
the shear dispersion $\langle |\bar\gamma|^2 \rangle(\theta)$. It is related to the power spectrum through

$$ \langle |\bar\gamma|^2 \rangle(\theta) = \frac{1}{2\pi} \int \mathrm{d}\ell\; \ell\, P_\kappa(\ell)\, W_{\rm TH}(\ell\theta) , \quad {\rm where} \quad W_{\rm TH}(\eta) = \frac{4 J_1^2(\eta)}{\eta^2} \tag{106} $$

is the top-hat filter function (see, e.g., Kaiser 1992). A practical unbiased
estimator of the mean shear in the aperture is $\bar\gamma = N^{-1} \sum_{i=1}^{N} \epsilon_i$, where
$N$ is the number of galaxies in the aperture. However, the square of this
expression is not an unbiased estimator of $\langle |\bar\gamma|^2 \rangle$, since the diagonal terms of
the resulting double sum yield additional terms, $E(\epsilon_i \epsilon_i^*) = |\gamma(\boldsymbol{\theta}_i)|^2 + \sigma_\epsilon^2$.
An unbiased estimate for the shear dispersion is obtained by omitting the
diagonal terms,

$$ \widehat{\langle |\bar\gamma|^2 \rangle} = \frac{1}{N(N-1)} \sum_{i \ne j}^{N} \epsilon_i\, \epsilon_j^* . \tag{107} $$

This expression is then averaged over many apertures placed on the data field.
Again, the generalization to allow for weighting of galaxy images is obvious.
Note in particular that this estimator is not positive semi-definite.
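
As an illustration, the estimator with the diagonal terms omitted can be written for a single aperture as below. This is a minimal unweighted sketch; the function name and the toy ellipticities are assumptions for the example. It uses the identity $\sum_{i \ne j} \epsilon_i \epsilon_j^* = |\sum_i \epsilon_i|^2 - \sum_i |\epsilon_i|^2$.

```python
def shear_dispersion_estimate(ellipticities):
    """Single-aperture estimate of <|gamma_bar|^2> without the diagonal terms."""
    n = len(ellipticities)
    if n < 2:
        raise ValueError("need at least two galaxies in the aperture")
    total = sum(ellipticities)                        # sum_i eps_i
    diagonal = sum(abs(e) ** 2 for e in ellipticities)  # sum_i |eps_i|^2
    # sum over i != j of eps_i * conj(eps_j) = |total|^2 - diagonal (a real number)
    return (abs(total) ** 2 - diagonal) / (n * (n - 1))


# Hypothetical apertures: five identical ellipticities, and two opposite ones.
coherent = shear_dispersion_estimate([0.1 + 0.0j] * 5)
opposite = shear_dispersion_estimate([0.3 + 0.0j, -0.3 + 0.0j])
print(coherent, opposite)
```

The second toy aperture yields a negative value, which makes explicit that a single-aperture estimate is not positive semi-definite; only the average over many apertures is expected to be non-negative.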

The aperture mass. Consider a circular aperture of radius $\theta$; for a point
inside the aperture, define the tangential and cross-components of the shear
relative to the center of the aperture (as before); then define

$$ M_{\rm ap}(\theta) = \int \mathrm{d}^2\vartheta\; Q(|\boldsymbol{\vartheta}|)\, \gamma_{\rm t}(\boldsymbol{\vartheta}) , \tag{108} $$

where $Q$ is a weight function with support $\vartheta \in [0, \theta]$. If we use the function
$Q$ given in (81), the dispersion of $M_{\rm ap}(\theta)$ is related to the power spectrum by
(Schneider et al. 1998a)

$$ \langle M_{\rm ap}^2 \rangle(\theta) = \frac{1}{2\pi} \int_0^\infty \mathrm{d}\ell\; \ell\, P_\kappa(\ell)\, W_{\rm ap}(\theta\ell) , \quad {\rm with} \quad W_{\rm ap,1}(\eta) := \frac{576\, J_4^2(\eta)}{\eta^4} . \tag{109} $$

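Crittenden et al. (2002) suggested a different pair $U$ and $Q$ of filter functions,

$$ U(\vartheta) = \frac{1}{2\pi\theta^2} \left( 1 - \frac{\vartheta^2}{2\theta^2} \right) \exp\left( -\frac{\vartheta^2}{2\theta^2} \right) ; \quad Q(\vartheta) = \frac{\vartheta^2}{4\pi\theta^4} \exp\left( -\frac{\vartheta^2}{2\theta^2} \right) . \tag{110} $$

These functions have the disadvantage of not having finite support; however,
due to the very strong fall-off for $\vartheta \gg \theta$, for many practical purposes the
support can be considered effectively as finite. This little drawback is
compensated by the convenient analytic properties of these filter functions, as
we shall see later. For example, the relation of the corresponding aperture
mass dispersion to the power spectrum is again given by the first of eqs. (109),
but the filter function simplifies to

$$ W_{\rm ap,2}(\eta) = \frac{\eta^4}{4}\, e^{-\eta^2} . \tag{111} $$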

Whereas the filter functions which relate the power spectrum to the shear
correlation functions, i.e., the Bessel functions appearing in (105), and to the
shear dispersion, given by $W_{\rm TH}$, are quite broad filters, implying that these
statistics at a given angular scale depend on the power spectrum over a wide
range of $\ell$, the two filter functions $W_{\rm ap,1}$ and $W_{\rm ap,2}$ are very localized and thus the
aperture mass dispersion yields highly localized information about the power
spectrum (see Bartelmann & Schneider 1999, who showed that replacing the
filter function $W_{\rm ap}$ by a delta-'function' causes an error of only $\sim 10\%$). Hence,
the shape of $\langle M_{\rm ap}^2 \rangle(\theta)$ directly reflects the shape of the power spectrum as
can also be seen in Fig. 35 below.
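
The different widths of these filters can be made concrete numerically. The sketch below assumes the filter functions $W_{\rm TH}(\eta) = 4J_1^2(\eta)/\eta^2$, $W_{\rm ap,1}(\eta) = 576 J_4^2(\eta)/\eta^4$ and $W_{\rm ap,2}(\eta) = (\eta^4/4)\, e^{-\eta^2}$; the helper names, the grid in $\eta$ and the use of Bessel's integral with a trapezoidal rule to evaluate $J_n$ are choices made for this illustration only.

```python
import math


def bessel_j(n, x, nodes=128):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n t - x sin t) dt."""
    h = math.pi / nodes
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # end points t = 0 and t = pi
    for k in range(1, nodes):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def w_th(eta):
    return 4.0 * bessel_j(1, eta) ** 2 / eta**2


def w_ap1(eta):
    return 576.0 * bessel_j(4, eta) ** 2 / eta**4


def w_ap2(eta):
    return 0.25 * eta**4 * math.exp(-(eta**2))


def half_max_band(filter_fn, etas):
    """Return (eta at the maximum, lowest eta, highest eta) with filter >= half its maximum."""
    values = [filter_fn(eta) for eta in etas]
    peak = max(values)
    above = [eta for eta, v in zip(etas, values) if v >= 0.5 * peak]
    return etas[values.index(peak)], above[0], above[-1]


etas = [0.02 * k for k in range(1, 751)]  # eta = ell * theta from 0.02 to 15
for name, fn in (("W_TH", w_th), ("W_ap,1", w_ap1), ("W_ap,2", w_ap2)):
    eta_peak, low, high = half_max_band(fn, etas)
    print(f"{name}: peak at eta = {eta_peak:.2f}, "
          f"above half maximum for {low:.2f} <= eta <= {high:.2f}")
```

On such a grid the top-hat filter is largest at the smallest $\eta$, i.e. it acts as a low-pass filter that includes all small $\ell$, whereas both aperture-mass filters are band-pass filters that vanish for $\eta \to 0$.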



Interrelations. These various 2-point statistics all depend linearly on the
power spectrum $P_\kappa$; therefore, one should not be too surprised that they
are all related to each other (Crittenden et al. 2002). The surprise perhaps
is that these interrelations are quite simple. First, the relations between $\xi_\pm$
and $P_\kappa$ can be inverted, making use of the orthonormality relation of Bessel
functions:

$$ P_\kappa(\ell) = 2\pi \int_0^\infty \mathrm{d}\theta\; \theta\, \xi_+(\theta)\, J_0(\ell\theta) = 2\pi \int_0^\infty \mathrm{d}\theta\; \theta\, \xi_-(\theta)\, J_4(\ell\theta) . \tag{112} $$

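Next, we take one of these and plug it into the relation (105) between the
other correlation function and $P_\kappa$, to find:

$$ \xi_+(\theta) = \xi_-(\theta) + \int_\theta^\infty \frac{\mathrm{d}\vartheta}{\vartheta}\; \xi_-(\vartheta) \left( 4 - 12\, \frac{\theta^2}{\vartheta^2} \right) ; \tag{113} $$

$$ \xi_-(\theta) = \xi_+(\theta) + \int_0^\theta \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2}\; \xi_+(\vartheta) \left( 4 - 12\, \frac{\vartheta^2}{\theta^2} \right) . \tag{114} $$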

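These equations show that the two shear correlation functions are not
independent of each other, the reason for that being that the shear (which itself
is a two-component quantity) is derived from a single scalar field, namely
the deflection potential $\psi$. We shall return to this issue further below. Using
(112) in the equation for the shear dispersion, one finds

$$ \langle |\bar\gamma|^2 \rangle(\theta) = \int_0^{2\theta} \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2}\; \xi_+(\vartheta)\, S_+\!\left( \frac{\vartheta}{\theta} \right) = \int_0^\infty \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2}\; \xi_-(\vartheta)\, S_-\!\left( \frac{\vartheta}{\theta} \right) , $$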

where the $S_\pm$ are simple functions, given explicitly in Schneider et al. (2002a)
and plotted in Fig. 33. Finally, the same procedure for the aperture mass
dispersion lets us write

$$ \langle M_{\rm ap}^2 \rangle(\theta) = \int_0^{2\theta} \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2}\; \xi_+(\vartheta)\, T_+\!\left( \frac{\vartheta}{\theta} \right) = \int_0^{2\theta} \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2}\; \xi_-(\vartheta)\, T_-\!\left( \frac{\vartheta}{\theta} \right) , \tag{115} $$

again with analytically known functions $T_\pm$, given for the filter function (81)
in Schneider et al. (2002a), and for the filter function (110) in Jarvis et
al. (2003b). Hence, all these 2-point statistics can be evaluated from the
correlation functions $\xi_\pm(\theta)$, which is of particular interest, since they can be
measured best: Real data fields contain holes and gaps (like CCD defects,
bright stars, nearby galaxies, etc.) which make the placing of apertures
difficult; however, the evaluation of the correlation functions is not affected
by gaps, as one uses all pairs of galaxy images with a given angular separation.
Furthermore, it should be noted that the aperture mass dispersion at angular
scale $\theta$ can be calculated from $\xi_\pm(\vartheta)$ in the finite range $\vartheta \in [0, 2\theta]$, and the
shear dispersion can be calculated from $\xi_+$ on $\vartheta \in [0, 2\theta]$, but not from $\xi_-$
on a finite interval; this is due to the fact that $\xi_-$ on small scales does not
contain the information of the power spectrum on large scales, because of the
filter function $J_4$ in (105).

Fig. 33. The functions $S_\pm(x)$ and $T_\pm(x)$ which relate the shear and aperture mass
dispersion to the correlation functions. Note that $S_-$ does not vanish for $x > 2$,
unlike the other three functions, which do vanish there (from Schneider et al. 2002a)
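
That $\xi_-$ follows from $\xi_+$ on a finite interval can be checked numerically for a toy model. The sketch below assumes the relations $\xi_-(\theta) = \int \mathrm{d}\ell\, \ell\, J_4(\ell\theta) P_\kappa(\ell)/(2\pi)$ and $\xi_-(\theta) = \xi_+(\theta) + \int_0^\theta \mathrm{d}\vartheta\, (\vartheta/\theta^2)\, \xi_+(\vartheta)\, (4 - 12\vartheta^2/\theta^2)$, and uses the purely hypothetical spectrum $P_\kappa(\ell) = e^{-\ell^2/2}$, for which the $J_0$ transform gives $\xi_+(\theta) = e^{-\theta^2/2}/(2\pi)$ in closed form. Function names and integration settings are illustrative choices.

```python
import math


def bessel_j(n, x, nodes=128):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n t - x sin t) dt."""
    h = math.pi / nodes
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, nodes):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def xi_plus(theta):
    """xi_+ for the toy spectrum P_kappa(ell) = exp(-ell^2 / 2), known analytically."""
    return math.exp(-0.5 * theta**2) / (2.0 * math.pi)


def xi_minus_from_power(theta, ell_max=10.0):
    """xi_- as the J_4 transform of the toy spectrum, truncated at ell_max."""
    return simpson(
        lambda ell: ell / (2.0 * math.pi) * bessel_j(4, ell * theta) * math.exp(-0.5 * ell**2),
        0.0, ell_max)


def xi_minus_from_xi_plus(theta):
    """xi_- obtained from xi_+ on the finite interval [0, theta]."""
    integral = simpson(
        lambda v: v / theta**2 * xi_plus(v) * (4.0 - 12.0 * v**2 / theta**2),
        0.0, theta)
    return xi_plus(theta) + integral


for theta in (0.5, 1.5, 3.0):
    print(theta, xi_minus_from_power(theta), xi_minus_from_xi_plus(theta))
```

Both routes should agree to within the accuracy of the numerical integration, although the second one never refers to the power spectrum.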

We also note that from a cosmic shear survey, the power spectrum $P_\kappa$
can be determined directly, as has been investigated by Kaiser (1998),
Seljak (1998) and Hu & White (2001). This is not done by applying (112), as
these relations would require the determination of the correlation function
for all separations, but by more sophisticated methods. A simple example
(though not optimal) is to consider the measured shear field on a square;
Fourier transforming it and binning modes in $|\boldsymbol{\ell}|$ then yields an estimate of
the power spectrum, once the power from the intrinsic ellipticity dispersion
is subtracted. Better methods aim at minimizing the variance of the
reconstructed power spectrum (Seljak 1998; Hu & White 2001). As mentioned
before, the aperture mass dispersion is a filtered version of the power spectrum
with such a narrow filter that it contains essentially the same information
as $P_\kappa$ over the corresponding angular scale and at $\ell \sim 5/\theta$, provided $P_\kappa$ has
no sharp features.

6.4 Cosmic shear and cosmology 

Why cosmology from cosmic shear? Before continuing, it is worth pausing
for a second to ask why one tries to investigate cosmological questions
by using cosmic shear, given that it is widely assumed that the
CMB will measure 'all' cosmological quantities with high accuracy. Partial
answers to this question are:

• Cosmic shear measures the mass distribution at much lower redshifts
($z \lesssim 1$) and at smaller physical scales [$R \sim 0.3\, h^{-1}\, (\theta/1')$ Mpc] than the
CMB; indeed, it is the only way to map out the dark matter distribution
directly without any assumptions about the relation between dark and
baryonic matter.

• Cosmic shear measures the non-linearly evolved mass distribution and its
associated power spectrum $P_\delta(k)$; hence, in combination with the CMB it
allows us to study the evolution of the power spectrum and, in particular,
provides a very powerful test of the gravitational instability paradigm for
structure growth.

• As was demonstrated by the recent results from the WMAP satellite
(Bennett et al. 2003), the strongest constraints are derived when combining
CMB measurements (constraining the power spectrum on large spatial
scales) with measurements on substantially smaller scales, to break
parameter degeneracies remaining from the CMB results alone (see
Spergel et al. 2003). Hu & Tegmark (1999) have explicitly demonstrated how
much the accuracy of estimates of cosmological parameters is improved
when the CMB results from missions like WMAP and later Planck are
complemented by cosmic shear measurements (see Fig. 34). In fact, as
we shall see later, CMB anisotropy measurements have
already been combined with cosmic shear measurements (see Fig. 47),
leading to substantially improved constraints on the cosmological
parameters.

• It provides a fully independent way to probe the cosmological model;
given the revolutionary claims coming from the CMB, SN Ia, and the LSS
of the galaxy distribution, namely that more than 95% of the contents of
the Universe is in a form about whose nature we have not the slightest idea
(the names 'dark matter' and 'dark energy' reflect our ignorance
about their physical nature), an additional independent verification of
these claims is certainly welcome.

• For the foreseeable future, astronomical observations will provide the only
possibility to probe the dark energy empirically. The equation of state
of the dark energy can be probed best at relatively low redshifts, that is
with SN Ia and cosmic shear observations, whereas CMB anisotropy
measurements are relatively insensitive to the properties of the dark energy,
as the latter was subdominant at the epoch of recombination.

• As we have seen in Sect. 5.8, cosmic shear studies provide a new and 
highly valuable search method for cluster-scale matter concentrations. 



Expectations. The cosmic shear signal depends on the cosmological model,
parameterized by $\Omega_{\rm m}$, $\Omega_\Lambda$, and the shape parameter $\Gamma_{\rm spect}$ of the power
spectrum, the normalization of the power spectrum, usually expressed in terms
of $\sigma_8$, and the redshift distribution of the sources. By measuring $\xi_\pm$ over a
significant range of angular scales one can derive constraints on these
parameters. To first order, the amplitude of the cosmic shear signal depends on the
combination $\sim \sigma_8\, \Omega_{\rm m}^{0.5}$, very similar to the cluster abundance. Furthermore,
the cosmic shear signal shows a strong dependence on the source redshift
distribution. These dependencies are easily understood qualitatively: A higher
normalization $\sigma_8$ increases $P_\delta$ on all scales, thus increasing $P_\kappa$. The increase
with $\Omega_{\rm m}$ is mainly due to the prefactor in (99), i.e. due to the fact that
the light deflection depends on $\Delta\rho$, not merely on $\delta = \Delta\rho/\rho$, as most
other cosmological probes do. Finally, increasing the redshift of sources has two
effects: first, the lens efficiency $D_{\rm ds}/D_{\rm s} = f_K(w_{\rm s} - w)/f_K(w_{\rm s})$ at given
distance $w$ increases as the sources are moved further away, and second, a larger
source redshift implies a longer ray path through the inhomogeneous matter
distribution.

In Fig. 35 the predictions of the shear dispersion and the aperture mass
dispersion are shown as a function of angular scale, for several cosmological
models. The dependencies of the power spectrum $P_\kappa$ on cosmological
parameters and $\ell$ are reflected in these cosmic shear measures. In particular, the
narrow filter function which relates the aperture mass dispersion to the power
spectrum implies that the peak in $\ell^2 P_\kappa(\ell)$ at around $\ell \sim 10^4$ (see Fig. 32)
translates into a peak of $\langle M_{\rm ap}^2 \rangle$ at around $\theta \sim 1'$. The non-linear evolution
of the power spectrum dominates the cosmic shear result for scales below
$\sim 30'$; the fact that the non-linear predictions approach the linear ones at
somewhat smaller scales for the shear dispersion $\langle |\bar\gamma|^2 \rangle$ is due to the fact that
this statistic corresponds to a broad-band filter $W_{\rm TH}$ (106) of $P_\kappa$ which
includes the whole range of small $\ell$ values, which are less affected by non-linear
evolution.

Fig. 34. The improvement of the accuracy of cosmological parameters when
supplementing CMB data from WMAP (upper panel) and the Planck satellite (lower
panel) by a cosmic shear survey of solid angle $\theta^2 \pi$. The accuracies are significantly
improved, certainly when combined with WMAP, but even in combination with
Planck, the accuracies of the density parameters can be increased, when using
next-generation cosmic shear surveys with hundreds of square degrees (from Hu &
Tegmark 1999)

Fig. 35. The square root of the aperture mass dispersion (left) and of the shear
dispersion (right), for the same cosmological models as were used for Fig. 32, again
with results from assuming linear growth of structure in the Universe shown as thin
curves, whereas the fully non-linear evolution was taken into account for the thick
curves. One sees that the aperture mass signal is considerably smaller than that
of the shear dispersion; this is due to the fact that the filter function $W_{\rm ap}$ is much
narrower than $W_{\rm TH}$; hence, at a given angular scale, $\langle M_{\rm ap}^2 \rangle$ samples less power than
$\langle |\bar\gamma|^2 \rangle$. However, this also implies that the aperture mass dispersion provides much
more localized information about the power spectrum than the shear dispersion
and is therefore a more useful statistic to consider. Other advantages of $\langle M_{\rm ap}^2 \rangle$
will be described further below. For scales below $\sim 30'$, the non-linear evolution of
the power spectrum becomes very important (from Schneider et al. 1998a)

Deriving constraints. From the measured correlation functions $\xi_\pm(\theta)$ (or
any other measure of the cosmic shear, but we will concentrate on the
statistic which is most easily obtained from real data), obtaining constraints
on cosmological parameters can proceed through maximizing the likelihood
$\mathcal{L}(p|\xi^{\rm obs})$, which yields the probability for the set of cosmological parameters
being $p$, given the observed correlation function $\xi^{\rm obs}$. This likelihood is given
by the probability $P(\xi^{\rm obs}|p)$ that the observed correlation function is $\xi^{\rm obs}$,
given the parameters $p$. For a given set of parameters $p$, the correlation
function is predicted. If one assumes that the observed correlations $\xi^{\rm obs}$ are
drawn from a (multi-variate) Gaussian probability distribution, then

$$ P(\xi^{\rm obs}|p) = \frac{1}{(2\pi)^{n/2} \sqrt{\det {\rm Cov}}}\, \exp\left( -\frac{\chi^2(p, \xi^{\rm obs})}{2} \right) , $$

with

$$ \chi^2(p, \xi^{\rm obs}) = \sum_{i,j} \left( \xi_i(p) - \xi_i^{\rm obs} \right) {\rm Cov}^{-1}_{ij} \left( \xi_j(p) - \xi_j^{\rm obs} \right) . \tag{116} $$

Here, the $\xi_i = \xi(\theta_i)$ are the values of the correlation function(s) (i.e., either
one of the $\xi_\pm$, or both) in angular bins, $n$ is the number of angular bins in case
either one of the $\xi_\pm$ is used, or if both are combined, twice the number of
angular bins, and ${\rm Cov}^{-1}$ is the inverse of the covariance matrix, which is
defined as

$$ {\rm Cov}_{ij} = \left\langle \left( \xi_i(p) - \xi_i^{\rm obs} \right) \left( \xi_j(p) - \xi_j^{\rm obs} \right) \right\rangle , \tag{117} $$

where the average is over multiple realizations of the cosmic shear survey
under consideration. ${\rm Cov}_{ij}$ can be determined either from the $\xi_\pm$ themselves, from
simulations, or estimated from the data in terms of the $\xi_\pm^{\rm obs}$ (see Schneider et
al. 2002b; Kilbinger & Schneider 2004; Simon et al. 2004). Nevertheless, the
calculation of the covariance is fairly cumbersome, and most authors have
used approximate methods to derive it, such as the field-to-field variations
of the measured correlation. In fact, this latter approach may be more
accurate than using the analytic expressions of the covariance in terms of the
correlation function, which are obtained by assuming that the shear field
is Gaussian, so that the four-point correlation function can be factorized as
products of two-point correlators. As it turns out, $\xi_+(\theta)$ is strongly correlated
across angular bins, $\xi_-(\theta)$ much less so; this is due to the fact that the
filter function that describes $\xi$ in terms of the power spectrum $P_\kappa$ is much
broader for $\xi_+$ (namely $J_0$) than $J_4$, which applies for $\xi_-$.

The accuracy with which $\xi_\pm$ can be measured, and thus the covariance
matrix, depends on the number density of galaxies (that is, depth and quality
of the images), the total solid angle covered by the survey, and its geometrical
arrangement (compact survey vs. widely separated pointings). The accuracy
is determined by a combination of the intrinsic ellipticity dispersion and the
cosmic (or sampling) variance. The likelihood function then becomes

$$ \mathcal{L}(p|\xi^{\rm obs}) = \frac{1}{(2\pi)^{n/2} \sqrt{\det {\rm Cov}}}\, \exp\left( -\frac{\chi^2(p, \xi^{\rm obs})}{2} \right) P_{\rm prior}(p) , \tag{118} $$

where $P_{\rm prior}(p)$ contains prior information (or prejudice) about the
parameters to be determined. For example, the redshift distribution of the sources
(at given apparent magnitude) is fairly well known from spectroscopic
redshift surveys, and so the prior probability for $z_{\rm s}$ would be chosen to be a fairly
narrow function which describes this prior knowledge on the redshifts. One
often assumes that all but a few parameters are known precisely, and thus
considers a restricted space of parameters; this is equivalent to replacing the
prior for those parameters which are fixed by a delta-'function'. If $m$
parameters are assumed to be undetermined, but one is mainly interested in the
confidence contours of $m' < m$ parameters, then the likelihood function is
integrated over the remaining $m - m'$ parameters; this is called marginalization
and yields the likelihood function for these $m'$ parameters.
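
The likelihood evaluation and the marginalization step can be sketched on a parameter grid as follows. Everything specific in this example is an assumption made for illustration: the two-parameter power-law model for the correlation function, the noise-free mock data, the diagonal covariance with a fixed error per bin, the flat prior and the grid ranges.

```python
import math


def chi_squared(model, observed, inv_cov):
    """chi^2 = sum_ij (xi_i(p) - xi_i^obs) Cov^-1_ij (xi_j(p) - xi_j^obs)."""
    diff = [m - o for m, o in zip(model, observed)]
    n = len(diff)
    return sum(diff[i] * inv_cov[i][j] * diff[j] for i in range(n) for j in range(n))


def likelihood(model, observed, inv_cov, det_cov, prior=1.0):
    """Gaussian likelihood for a parameter-independent covariance, times a prior."""
    n = len(observed)
    norm = (2.0 * math.pi) ** (n / 2.0) * math.sqrt(det_cov)
    return math.exp(-0.5 * chi_squared(model, observed, inv_cov)) / norm * prior


# Hypothetical two-parameter model: xi(theta) = amplitude * theta**(-slope).
thetas = [1.0, 2.0, 4.0]


def toy_model(amplitude, slope):
    return [amplitude * t ** (-slope) for t in thetas]


observed = toy_model(2.0, 1.0)   # noise-free mock data
sigma = 0.05                     # assumed uncorrelated error per angular bin
inv_cov = [[(1.0 / sigma**2 if i == j else 0.0) for j in range(3)] for i in range(3)]
det_cov = sigma ** (2 * 3)

amplitudes = [1.5 + 0.1 * i for i in range(11)]
slope_step = 0.05
slopes = [0.5 + slope_step * j for j in range(21)]
grid = [[likelihood(toy_model(a, s), observed, inv_cov, det_cov) for s in slopes]
        for a in amplitudes]

# Marginalize over the slope: integrate the likelihood along that axis.
marginal = [sum(row) * slope_step for row in grid]
best_amplitude = amplitudes[marginal.index(max(marginal))]
print(best_amplitude)
```

With real data the covariance would in general be non-diagonal, so that `inv_cov` has to be obtained by a proper matrix inversion, and the prior would encode, e.g., the knowledge about the source redshift distribution.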

There are two principal contributions to the 'noise' of cosmic shear mea- 
surements. One is the contribution coming from the finite intrinsic ellipticity 
dispersion of the source galaxies, the other due to the finite data fields of any 
survey. This latter effect implies that only a typical part of the sky is mapped, 
whose properties will in general deviate from the average properties of such a 
region in the sky for a given cosmology. This effect is called cosmic variance, 
or sample variance. Whereas the noise from intrinsic ellipticity dispersions
dominates at small angular scales, at scales beyond a few arcminutes the 
cosmic variance is always the dominating effect (e.g., Kaiser 1998; White & 
Hu 2000). 

Of course, all of what was said above can be carried over to the other
second-order shear statistics, with their respective covariance matrices. The
first cosmic shear measurements were made in terms of the shear dispersion
and compared to theoretical predictions from a range of cosmological
models. As is true for the correlation functions, the shear dispersion is strongly
correlated between different angular scales. This is much less the case for
the aperture mass dispersion, where the correlation quickly falls off once the
angular scales differ by more than a factor $\sim 1.5$ (see Schneider et al. 2002b).
Even less correlated is the power spectrum itself. These properties are of
great interest if the results from a cosmic shear survey are displayed as a
curve with error bars; for the aperture mass dispersion and the power
spectrum estimates, these errors are largely uncorrelated. However, for deriving
cosmological constraints, the correlation functions $\xi_\pm$ are most useful since
they contain all second-order information in the data, in addition to being
the primary observable.

6.5 E-modes, B-modes 

In the derivation of the lensing properties of the LSS, we ended up with
an equivalent surface mass density. In particular, this implied that $\mathcal{A}$ is a
symmetric matrix, and that the shear can be obtained in terms of $\kappa$ or $\psi$.
Now, the shear is a 2-component quantity, whereas both $\kappa$ and $\psi$ are scalar
fields. This then obviously implies that the two shear components are not
independent of each other!

Recall that (54) yields a relation between the gradient of $\kappa$ and the first
derivatives of the shear components; in particular, (54) implies that $\nabla \times \boldsymbol{u}_\gamma = 0$,
yielding a local constraint relation between the second derivatives of the
shear components. The validity of this constraint equation guarantees that
the imaginary part of (44) vanishes. This constraint is also present at the
level of 2-point statistics, since one expects from (112) that

$$ \int_0^\infty \mathrm{d}\theta\; \theta\, \xi_+(\theta)\, J_0(\theta\ell) = \int_0^\infty \mathrm{d}\theta\; \theta\, \xi_-(\theta)\, J_4(\theta\ell) . $$

Hence, the two correlation functions $\xi_\pm$ are not independent. The observed
shear field is not guaranteed to satisfy these relations, due to noise, remaining
systematics, or other effects. Therefore, searching for deviations from this
relation allows a check for these effects. However, there might also be a 'shear'
component present that is not due to lensing (by a single equivalent thin
matter sheet $\kappa$). Shear components which satisfy the foregoing relations are called
E-modes; those which don't are B-modes. These names are borrowed from
the polarization of the CMB, which has the same mathematical properties
as the shear field, namely that of a polar.



Fig. 36. Sketch of the distinction be- 
tween E- and B-modes of the shear. The 
upper row shows a typical E-mode shear 
pattern coming from a mass overden- 
sity (left) or underdensity (right), yield- 
ing tangential and radial alignment of 
the shear, respectively. The lower row 
shows a B-mode pattern, which is ob- 
tained from the E-mode pattern by ro- 
tating all shears by 45°. Those cannot 
be produced from gravitational lensing 
(from van Waerbeke & Mellier 2003) 



The best way to separate these modes locally is provided by the aperture
measures: $\langle M_{\rm ap}^2(\theta) \rangle$ is sensitive only to E-modes. If one defines in analogy
(recall (77))

$$ M_\perp(\theta) = \int \mathrm{d}^2\vartheta\; Q(|\boldsymbol{\vartheta}|)\, \gamma_\times(\boldsymbol{\vartheta}) , \tag{119} $$

then $\langle M_\perp^2(\theta) \rangle$ is sensitive only to B-modes. In fact, one can show that for
a pure E-mode shear field, $M_\perp = 0$, and for a pure B-mode field, $M_{\rm ap} = 0$.
Furthermore, in general (that is, even if a B-mode is present), $\langle M_{\rm ap} \rangle = 0$,
since $\langle \kappa \rangle = 0$, and $\langle M_\perp \rangle = 0$, owing to parity invariance: a non-zero mean
value of $M_\perp$ would introduce a net orientation into the shear field. Using the
same argument, one finds that $\langle M_{\rm ap}^m M_\perp^n \rangle = 0$ for $n$ odd (Schneider 2003).
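
The separation of modes by the aperture measures can be illustrated with a discretized version of $M_{\rm ap}$ and $M_\perp$ for a single aperture centred on the origin. The sketch below rests on several assumptions: the Gaussian-type weight $Q(\vartheta) = \vartheta^2/(4\pi\theta^4)\, \exp(-\vartheta^2/2\theta^2)$ is adopted as one possible choice, the shear is sampled on a regular grid rather than at galaxy positions, and the tangential toy pattern with amplitude $0.05/(1+\vartheta)$ is purely hypothetical. Multiplying every shear by $i$ rotates it by $45^\circ$.

```python
import cmath
import math


def q_filter(r, theta):
    """Gaussian-type weight function Q (an assumed choice for this sketch)."""
    return r**2 / (4.0 * math.pi * theta**4) * math.exp(-(r**2) / (2.0 * theta**2))


def aperture_measures(positions, shears, theta, area_per_point):
    """Discrete M_ap and M_perp for an aperture centred on the origin."""
    m_ap = 0.0
    m_perp = 0.0
    for (x, y), gamma in zip(positions, shears):
        r = math.hypot(x, y)
        if r == 0.0:
            continue  # Q vanishes at the centre, where the polar angle is undefined
        phi = math.atan2(y, x)
        rotated = gamma * cmath.exp(-2j * phi)
        m_ap += q_filter(r, theta) * (-rotated.real)    # tangential component
        m_perp += q_filter(r, theta) * (-rotated.imag)  # cross component
    return m_ap * area_per_point, m_perp * area_per_point


def tangential_pattern(x, y):
    """Toy shear field with purely tangential alignment around the origin."""
    r = math.hypot(x, y)
    if r == 0.0:
        return 0j
    phi = math.atan2(y, x)
    return -0.05 / (1.0 + r) * cmath.exp(2j * phi)


step = 0.2
positions = [(step * i, step * j) for i in range(-25, 26) for j in range(-25, 26)]
e_mode = [tangential_pattern(x, y) for x, y in positions]
b_mode = [1j * g for g in e_mode]   # every shear rotated by 45 degrees

m_ap_e, m_perp_e = aperture_measures(positions, e_mode, 1.0, step**2)
m_ap_b, m_perp_b = aperture_measures(positions, b_mode, 1.0, step**2)
print(m_ap_e, m_perp_e)
print(m_ap_b, m_perp_b)
```

For the tangential pattern only $M_{\rm ap}$ is non-zero, while for the rotated pattern the signal moves entirely into $M_\perp$, up to rounding errors.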



E/B-mode decomposition of a shear field. There are a number of
(equivalent) ways to decompose a shear field into its two modes. One is
provided by the Kaiser & Squires mass reconstruction (44), which yields, for a
general shear field, a complex surface mass density $\kappa = \kappa^{\rm E} + i\kappa^{\rm B}$. Another
separation is obtained by considering the vector field $\boldsymbol{u}_\gamma(\boldsymbol{\theta})$ (54) obtained
from the first derivatives of the shear components. This vector will in general
not be a gradient field; its gradient component corresponds to the E-mode
field, the remaining one to the B-mode. Hence one defines

$$ \nabla^2 \kappa^{\rm E} = \nabla \cdot \boldsymbol{u}_\gamma ; \quad \nabla^2 \kappa^{\rm B} = \nabla \times \boldsymbol{u}_\gamma . \tag{120} $$

In full analogy with the 'lensing-only' case (i.e., a pure E-mode), one defines
the (complex) potential $\psi(\boldsymbol{\theta}) = \psi^{\rm E}(\boldsymbol{\theta}) + i\psi^{\rm B}(\boldsymbol{\theta})$ by the Poisson equation
$\nabla^2 \psi = 2\kappa$, and the shear is obtained in terms of the complex $\psi$ in the usual
way,

$$ \gamma = \gamma_1 + i\gamma_2 = (\psi_{,11} - \psi_{,22})/2 + i\psi_{,12} = \left[ \frac{1}{2} \left( \psi^{\rm E}_{,11} - \psi^{\rm E}_{,22} \right) - \psi^{\rm B}_{,12} \right] + i \left[ \psi^{\rm E}_{,12} + \frac{1}{2} \left( \psi^{\rm B}_{,11} - \psi^{\rm B}_{,22} \right) \right] . \tag{121} $$



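On the level of second-order statistics, one considers the Fourier transforms
of the E- and B-mode convergence, and defines the two power spectra $P_{\rm E}$,
$P_{\rm B}$, and the cross-power spectrum $P_{\rm EB}$ by

$$ \langle \hat\kappa^{\rm E}(\boldsymbol{\ell})\, \hat\kappa^{\rm E*}(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_{\rm E}(\ell) , $$
$$ \langle \hat\kappa^{\rm B}(\boldsymbol{\ell})\, \hat\kappa^{\rm B*}(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_{\rm B}(\ell) , \tag{122} $$
$$ \langle \hat\kappa^{\rm E}(\boldsymbol{\ell})\, \hat\kappa^{\rm B*}(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_{\rm EB}(\ell) . $$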

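From what was said above, the cross power $P_{\rm EB}$ vanishes for parity-symmetric
shear fields, and we shall henceforth ignore it. The shear correlation functions
now depend on the power spectra of both modes, and are given as (Crittenden
et al. 2002; Schneider et al. 2002a)

$$ \xi_+(\theta) = \int_0^\infty \frac{\mathrm{d}\ell\, \ell}{2\pi}\, J_0(\ell\theta) \left[ P_{\rm E}(\ell) + P_{\rm B}(\ell) \right] , $$
$$ \xi_-(\theta) = \int_0^\infty \frac{\mathrm{d}\ell\, \ell}{2\pi}\, J_4(\ell\theta) \left[ P_{\rm E}(\ell) - P_{\rm B}(\ell) \right] . $$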


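Hence, in the presence of B-modes, the correlation function $\xi_-$ cannot be
obtained from $\xi_+$, as was the case for a pure E-mode shear field. The inverse
relation (112) now gets modified to

$$ P_{\rm E}(\ell) = \pi \int_0^\infty \mathrm{d}\theta\; \theta \left[ \xi_+(\theta)\, J_0(\ell\theta) + \xi_-(\theta)\, J_4(\ell\theta) \right] , $$
$$ P_{\rm B}(\ell) = \pi \int_0^\infty \mathrm{d}\theta\; \theta \left[ \xi_+(\theta)\, J_0(\ell\theta) - \xi_-(\theta)\, J_4(\ell\theta) \right] . \tag{123} $$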

Hence, the two power spectra can be obtained from the shear correlation
functions. However, due to the infinite range of integration, one would need to
measure the correlation functions over all angular scales to apply the previous
equations for calculating the power spectra. Much more convenient for the
E/B-mode decomposition is the use of the aperture measures, since one can
show that

$$ \langle M_{\rm ap}^2 \rangle(\theta) = \frac{1}{2\pi} \int_0^\infty \mathrm{d}\ell\; \ell\, P_{\rm E}(\ell)\, W_{\rm ap}(\theta\ell) , $$
$$ \langle M_\perp^2 \rangle(\theta) = \frac{1}{2\pi} \int_0^\infty \mathrm{d}\ell\; \ell\, P_{\rm B}(\ell)\, W_{\rm ap}(\theta\ell) , \tag{124} $$

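so that these two-point statistics clearly separate E- and B-modes. In
addition, as mentioned before, they provide a highly localized measure of the
corresponding power spectra, since the filter function $W_{\rm ap}(\eta)$ involved is very
narrow. As was true for the E-mode only case, the aperture measures can be
expressed as finite integrals over the correlation functions,

$$ \langle M_{\rm ap}^2 \rangle(\theta) = \frac{1}{2} \int_0^{2\theta} \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2} \left[ \xi_+(\vartheta)\, T_+\!\left( \frac{\vartheta}{\theta} \right) + \xi_-(\vartheta)\, T_-\!\left( \frac{\vartheta}{\theta} \right) \right] , $$
$$ \langle M_\perp^2 \rangle(\theta) = \frac{1}{2} \int_0^{2\theta} \frac{\mathrm{d}\vartheta\; \vartheta}{\theta^2} \left[ \xi_+(\vartheta)\, T_+\!\left( \frac{\vartheta}{\theta} \right) - \xi_-(\vartheta)\, T_-\!\left( \frac{\vartheta}{\theta} \right) \right] , \tag{125} $$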

where the two functions T± are the same as in (115) and have been given 
explicitly in Schneider et al. (2002a) for the weight function Q given in (81), 
and in Jarvis et al. (2003) for the weight function (110). Hence, the relations 
(125) remove the necessity to calculate the aperture measures by placing 
apertures on the data field which, owing to gaps and holes, would make this 
an inaccurate and biased determination. Instead, obtaining the correlation 
functions from the data is all that is needed. 

The relations given above have been applied to recent cosmic shear sur- 
veys, and significant B-modes have been discovered (see Sect. 7); the ques- 
tion now is what are they due to? As mentioned before, the noise, which 
contributes to both E- and B-modes in similar strengths, could be under- 
estimated, the cosmic variance which also determines the error bars on the 
aperture measures and which depends on fourth-order statistical properties 
of the shear field could also be underestimated, there could be remaining 
systematic effects, or B-modes could indeed be present. There are two
possibilities known to generate a B-mode through lensing: The first-order in $\Phi$ (or
'Born') approximation may not be strictly valid, but as shown by ray-tracing
simulations through cosmic matter fields (e.g., Jain et al. 2000), the resulting 
B-modes are expected to be very small. Clustering of sources also yields a 
finite B-mode (Schneider et al. 2002a), but again, this effect is much smaller 
than the observed amplitude of the B-modes (see Fig. 37). 



Intrinsic alignment of source galaxies. Currently the best guess for the
origin of a finite B-mode is intrinsic correlations of galaxy ellipticities.
Such intrinsic alignments of galaxy ellipticities can be caused by tidal
gravitational fields during galaxy formation, owing to tidal interactions
between galaxies, or between galaxies and clusters. Predictions of the
alignment of the projected ellipticity of the galaxy mass can be made
analytically (e.g., in the framework of tidal torque theory) or from
numerical simulations; however, the
predictions from various groups differ by large factors (e.g., Croft & Metzler 
2000; Crittenden et al. 2001; Heavens et al. 2000; Jing 2002) which means 
that the process is not well understood at present. For example, the results 
of these studies depend on whether one assumes that the light of a galaxy is 
aligned with the dark matter distribution, or aligned with the angular mo- 
mentum vector of the dark halo. This is related to the question of whether 
the orientation of the galaxy light (which is the issue of relevance here) is
the same as that of the mass.

Fig. 37. The correlation functions $\xi_\pm(\theta)$ for a $\Lambda$CDM model
with $\Gamma_{\rm spect} = 0.21$ and $\sigma_8 = 1$, and a source population
with mean redshift $\langle z_s \rangle = 1.5$. Also plotted are the
corresponding correlation functions that arise separately from the E- and
B-modes, with the $\xi_{\rm E+}$ curve coinciding within the line thickness
with $\xi_+$. In this calculation, the clustering of the faint galaxy
population was taken into account; it gives rise to a very small B-mode
contribution, as can be seen from the $\xi_{\rm B\pm}$ curves. The smallness
of the B-mode due to intrinsic source clustering renders this effect not
viable to explain the B-modes observed in some of the cosmic shear surveys
(from Schneider et al. 2002a)

If intrinsic alignments play a role, then 

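$$\xi_+ = \langle \epsilon_i \epsilon_j^* \rangle = \langle \epsilon_i^{(s)} \epsilon_j^{(s)*} \rangle + \xi_+^{\rm lens} , \tag{126}$$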

and measured correlations $\xi_\pm$ contain both components, the intrinsic
correlation and the shear. Of course, there is no reason why intrinsic
correlations should have only a B-mode. If a B-mode contribution is
generated through this process, then the measured E-mode is most likely also
contaminated by intrinsic alignments. Given that intrinsic alignments yield
ellipticity correlations only for spatially close sources (i.e., close in
3-D, not merely in projection), it is clear that the deeper a cosmic shear
survey is, and thus the broader the redshift distribution of source
galaxies, the smaller is the relative amplitude of an intrinsic signal. Most
of the theoretical investigations on the strength of intrinsic alignments
predict that the deep cosmic shear surveys (say, with mean source redshifts
of $\langle z_s \rangle \sim 1$) are affected at a $\sim 10\%$ level, but
that shallower cosmic shear surveys are more strongly affected; for them,
the intrinsic alignment can be of the same order as or even larger than the
lensing signal.

However, the intrinsic signal can be separated from the lensing signal if
redshift information of the sources is available, owing to the fact that
$\langle \epsilon_i^{(s)} \epsilon_j^{(s)*} \rangle$ will be non-zero only if
the two galaxies are at essentially the same redshift. Hence, if
z-information is available (e.g., photometric redshifts), then galaxy pairs
which are likely to have similar redshifts are to be avoided in estimating
the cosmic shear signal (King & Schneider 2002; Heymans & Heavens 2002;
Takada & White 2004). This will change the expectation value of the shear
correlation function, but in a controllable way, as the redshifts are
assumed to be known. Indeed, using (photometric) redshifts, one can
simultaneously determine the intrinsic and the lensing signal, essentially
providing a cosmic shear tomography (King & Schneider 2003). This again is
accomplished by employing the fact that the intrinsic correlation can only
come from galaxies very close in redshift. Hence, in the presence of
intrinsic alignments, the redshift-dependent correlation functions
$\xi_\pm(z_i, z_j; \theta)$ between galaxies with estimated redshifts $z_i$
and $z_j$ are expected to show a strong peak over the range
$|z_i - z_j| \lesssim \Delta z$, where $\Delta z$ is the typical uncertainty
in photometric redshifts. It is this peak that allows one to identify and
subtract the intrinsic signal from the correlation functions. An efficient
method to calculate the covariance of the redshift-dependent correlation
functions has been developed by Simon et al. (2004), where the improvement
in the constraints on cosmological parameters from redshift information has
been studied, confirming the earlier results by Hu (1999) which were based
on considerations of the power spectrum.
Brown et al. (2002) obtained a measurement of the intrinsic ellipticity
correlation from the Super-COSMOS photographic plate data, where the
galaxies are at too low a redshift for cosmic shear to play any role.
Heymans et al. (2003) used the COMBO-17 data set (that will be described in
Sect. 7.3 below), for which accurate photometric redshifts are available, to
measure the intrinsic alignment. The result from both studies is that the
models predicting a large intrinsic amplitude can safely be ruled out.
Nevertheless, intrinsic alignment affects cosmic shear measurements, at
about the 2% level for a survey with the depth of the VIRMOS-DESCART survey,
and somewhat more for the slightly shallower COMBO-17 survey. Hence, to
obtain precision measurements of cosmic shear, which are very important for
constraining the equation of state of dark energy, these physically close
pairs of galaxies need to be identified in the survey, making accurate
photometric redshifts mandatory.

Correlation between intrinsic ellipticity and shear. The relation (126)
above implicitly assumes that the shear is uncorrelated with the intrinsic
shape of a neighboring galaxy. However, as pointed out by Hirata & Seljak
(2004), this is not necessarily the case. Hence consider galaxies at two
significantly different redshifts $z_i < z_j$. For them, the first term in
(126) vanishes. However, making use of $\epsilon = \epsilon^{(s)} + \gamma$,
one finds

$$\langle \epsilon_i \epsilon_j^* \rangle = \langle \epsilon_i^{(s)} \gamma_j^* \rangle + \xi_+^{\rm lens} , \tag{127}$$

where the first term on the right-hand side describes the correlation between
the intrinsic ellipticity of the lower-redshift galaxy and the shear along
the l.o.s. to the higher-redshift one. The correlation can in principle be
non-zero: if the intrinsic alignment of the light of a galaxy is determined
by the large-scale tidal gravitational field, then this tidal field at the
redshift $z_i$ causes both an alignment of the nearer galaxy and a
contribution to the shear of the more distant one (see Fig. 38). This
alignment effect can therefore not be removed by considering only pairs at
different redshifts.
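
The step leading to (127) can be written out term by term. The expansion below is a sketch of that reasoning for $z_i < z_j$; the argument given for dropping the third term (the shear of the nearer galaxy is produced by matter in front of it, which does not determine the intrinsic shape of the more distant galaxy) is added here as the usual justification and is not spelled out in the text.

```latex
\begin{align}
\langle \epsilon_i \epsilon_j^* \rangle
  &= \left\langle \left(\epsilon_i^{(s)} + \gamma_i\right)
     \left(\epsilon_j^{(s)} + \gamma_j\right)^* \right\rangle \\
  &= \underbrace{\langle \epsilon_i^{(s)} \epsilon_j^{(s)*} \rangle}_{=\,0\ \text{for}\ z_i \ne z_j}
   + \langle \epsilon_i^{(s)} \gamma_j^* \rangle
   + \underbrace{\langle \gamma_i\, \epsilon_j^{(s)*} \rangle}_{\approx\,0\ \text{for}\ z_i < z_j}
   + \underbrace{\langle \gamma_i \gamma_j^* \rangle}_{=\,\xi_+^{\rm lens}} .
\end{align}
```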







Fig. 38. A tidal gravitational field, for example caused by two matter concentra- 
tions, can produce an alignment of a galaxy situated at the same redshift (indicated 
by the solid ellipse), as well as contributing to the shear towards a more distant 
galaxy (as indicated by the dashed ellipse) (from Hirata & Seljak 2004) 

The importance of this effect depends on the nature of the alignment of
galaxies relative to an external tidal field. If the alignment is linear in
the tidal field strength, then this effect can be a serious contaminant of
the cosmic shear signal, especially for relatively shallow surveys (where
the mean source redshift is small); in particular, this effect can yield
much larger contaminations than the intrinsic alignment given by the first
term in (126). As can be seen from Fig. 38, the resulting contribution is
negative, hence decreases the lensing signal. If, however, the intrinsic
alignment depends quadratically on the tidal field, as is suggested by tidal
torque theory, then this effect is negligible. Whether or not this effect is
relevant needs to be checked from observations. Assuming that the matter
density field is represented approximately by the galaxy distribution, the
latter can be used to estimate the tidal gravitational field, in particular
its direction. Alternatively, since the correlation between the intrinsic
alignment and the shear towards more distant galaxies has a different
redshift dependence than the lensing shear signal, these two contributions
can be disentangled from the z-dependence of the signal.

It should be noted that the use of photometric redshifts also permits one to
study the cosmic shear measures as a function of source redshift; hence, one
can probe various redshift projections $P_\kappa(\ell)$ of the underlying
power spectrum $P_\delta(k; z)$ separately. This is due to the fact that the
cosmic shear signals from different populations of galaxies (i.e., with
different redshift distributions) lead to different weight functions $g(w)$
[see (94)], and thus to different weighting in the projection (99) of the
power spectrum. Not surprisingly, uncertainties of cosmological parameters
are thereby reduced (Hu 1999; Simon et al. 2004). Also, as shown by Taylor
(2001), Hu & Keeton (2002) and Bacon & Taylor (2003), in principle the
three-dimensional mass distribution $\delta(\mathbf{x})$ can be
reconstructed if the redshifts of the source galaxies are known (see
Sect. 7.6).

6.6 Predictions; ray-tracing simulations 

The power spectrum of the convergence $P_\kappa$ can be calculated from the
power spectrum of the cosmological matter distribution $P_\delta$, using
(99); the latter in turn is determined by the cosmological model. However,
the non-linear evolution of the power spectrum is essential for making
accurate quantitative predictions for the shear properties, and there is no
analytic method known to calculate the necessary non-linear $P_\delta$. As
was mentioned in Sect. 6.1 of IN, fairly accurate fitting formulae exist
which yield a closed-form expression for $P_\delta$ and which can be used to
obtain $P_\kappa$ (see, e.g., Jain & Seljak 1997). Nevertheless, there are a
number of reasons why this purely analytic approach should at least be
supplemented by numerical simulations.

• First, the fitting formulae for $P_\delta$ (Peacock & Dodds 1996; Smith et
al. 2003) have of course only a finite accuracy, and are likely to be
insufficient for comparison with results from the ongoing cosmic shear
surveys, which are expected to yield very accurate measurements owing to
their large solid angle.

• A second reason why simulations are needed is to test whether the various
approximations that enter the foregoing analytical treatment are in fact
accurate enough. To recall them, we employed the Born approximation, i.e.,
neglected terms of higher order than linear in the Newtonian potential when
deriving the convergence, and we assumed that the shear everywhere is small,
so that the difference between shear and reduced shear can be neglected, at
least on average. This, however, is not guaranteed: regions in the sky with
large shear are most likely also those regions where the convergence is
particularly large, and therefore one expects a correlation between $\gamma$
and $\kappa$ there, which can affect the dispersion of
$g = \gamma/(1 - \kappa)$.

• Third, whereas fairly accurate fitting formulae exist for the power
spectrum, this is not the case for higher-order statistical properties of
the matter distribution; hence, when considering higher-order shear
statistics (Sect. 9), numerical simulations will most likely be the only way
to obtain accurate predictions.

• The covariance of the shear correlations (and all other second-order shear
measures) depends on fourth-order statistics of the shear field, for which
hardly any useful analytical approximations are available. The analytical
covariance estimates are all based on the Gaussian assumption for the
fourth-order correlators. Therefore, simulations are invaluable for the
calculation of these covariances, which can be derived for arbitrary survey
geometries.

Ray-tracing simulations: The principle. The simulations proceed by
following light rays through the inhomogeneous matter distribution in the
Universe. The latter is generated by cosmological simulations of structure
evolution. Those start at an early epoch by generating a realization of a
Gaussian random field with a power spectrum according to the cosmological
model considered, and follow the evolution of the density and velocity field
of the matter using Newtonian gravity in an expanding Universe. The mass
distribution is represented by discrete particles whose evolution in time is
followed. A finite volume of the Universe is simulated this way, typically a
box of comoving side-length $L$, for which periodic boundary conditions are
applied. This allows one to use Fast Fourier Transforms (FFT) to evaluate
the gravitational potential and forces from the density distribution. The
box size $L$ should be chosen such that the box contains a representative
part of the real Universe, and must therefore be larger than the largest
scales on which structure is expected, according to the power spectrum; a
reasonable choice is $L \gtrsim 100\,h^{-1}$ Mpc. The number of grid points
and the number of particles that can be distributed in this volume is
limited by computer memory; modern simulations work typically with $256^3$
grid points and the same number of particles, though larger simulations have
also been carried out; this immediately yields the size of grid cells, of
order $0.5\,h^{-1}$ Mpc. This comoving length, if located at a redshift of
$z \sim 0.3$ (which is about the most relevant for cosmic shear), subtends
an angle of roughly $2'$ on the sky. The finite number of particles yields
the mass resolution of the simulations, which is typically
$\sim 10^{10}\,h^{-1} M_\odot$, depending on cosmological parameters.
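
The numbers quoted in this paragraph follow from simple arithmetic, sketched below. The flat cosmology with $\Omega_{\rm m} = 0.3$, the numerical value of the critical density and the function names are assumptions made for this illustration; they are not specified in the text.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # Hubble radius c/H0 in h^-1 Mpc
RHO_CRIT = 2.775e11     # critical density in h^2 M_sun / Mpc^3

def comoving_distance(z, omega_m=0.3, n_steps=2001):
    """Comoving distance in h^-1 Mpc for a flat model (assumed cosmology)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz)**3 + 1.0 - omega_m)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    return C_OVER_H0 * float(integral)

def simulation_resolution(box, n_grid, z_lens=0.3, omega_m=0.3):
    """Return (cell size [h^-1 Mpc], angle [arcmin], particle mass [h^-1 M_sun])."""
    cell = box / n_grid
    angle_arcmin = np.degrees(cell / comoving_distance(z_lens, omega_m)) * 60.0
    particle_mass = omega_m * RHO_CRIT * box**3 / n_grid**3
    return cell, angle_arcmin, particle_mass
```

For example, `simulation_resolution(128.0, 256)` corresponds to a cell size of $0.5\,h^{-1}$ Mpc, which under the assumed cosmology subtends about two arcminutes at $z = 0.3$, with a particle mass of about $10^{10}\,h^{-1} M_\odot$.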

In order to obtain higher spatial resolution, force calculations are split
up into near-field and far-field forces. The gravitational force due to the
distant matter distribution is obtained by grid-based FFT methods, whereas
the force from nearby masses is calculated from summing up the forces of
individual particles; such simulations yield considerably higher resolution
of the resulting mass distribution. Since the matter in these simulations is
represented by massive particles, these can undergo strong interactions,
leading to (unphysical) large orbital deflections. In order to avoid these
unphysical strong collisions, the force between pairs of particles is
modified at short distances, typically comparable to the mean separation of
two particles in the simulation. This softening length defines the minimum
length scale on which the results from numerical simulations can be
considered reliable. Cosmological simulations either consider the dark
matter only or, more recently, also incorporate the hydrodynamical effects
of baryons.

The outcome of such simulations, as far as it is relevant here, is the set
of 3-dimensional positions of the matter particles at different (output)
times or redshifts. In order to study the light propagation through this
simulated mass distribution, one employs multiple lens-plane theory. First,
the volume between us and sources at some redshift $z_s$ is filled with
boxes from the cosmological simulations. That is, the comoving distance
$w_s = w(z_s)$ is split up into $n$ intervals of length $L$, and the mass
distribution at an output time close to $t_i = t(w = (i - 1/2)L)$ is
considered to be placed at this distance. In this way, one has a light cone
covered by cubes containing representative matter distributions. Since the
mass distributions at the different times $t_i$ are not independent of each
other, but one is an evolved version of the earlier one, the resulting mass
distribution is highly correlated over distances much larger than $L$. This
can be avoided by making use of the statistical homogeneity and isotropy of
the mass distribution: each box can be translated by an arbitrary
two-dimensional vector, employing the periodicity of the mass distribution,
and rotated by an arbitrary angle; furthermore, the three different
projections of the box can be used for its orientation. In this way - a kind
of recycling of numerical results - the worst correlations are removed.
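
The stacking and recycling steps just described can be sketched as follows. This is a simplified illustration under stated assumptions: particle positions are given as an array in box units of length, the rotation about the line of sight is omitted for brevity, and the function names are hypothetical.

```python
import numpy as np

def plane_distances(w_source, box):
    """Comoving distances w_i = (i - 1/2) L of the boxes filling 0 < w < w_source."""
    n = int(np.ceil(w_source / box))
    return (np.arange(1, n + 1) - 0.5) * box

def randomize_box(positions, box, rng):
    """Return a re-oriented, periodically shifted copy of particle positions.

    positions: array of shape (N, 3) with coordinates in [0, box).
    The last column of the result is taken as the line-of-sight direction.
    """
    pos = np.asarray(positions, dtype=float)
    los = int(rng.integers(3))             # which of the three projections to use
    pos = np.roll(pos, 2 - los, axis=1)    # move the chosen axis to the last column
    shift = np.zeros(3)
    shift[:2] = rng.uniform(0.0, box, size=2)  # arbitrary transverse translation
    return np.mod(pos + shift, box)            # periodic wrap keeps particles in the box
```

Applying `randomize_box` with an independent random state to each box along the light cone removes the artificial repetition of the same structures from one box to the next, at the price of discarding genuine correlations on scales larger than the box.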

Alternatively, one can combine the outputs from several simulations with 
different realizations of the initial conditions. In this case, one can use simu- 
lation boxes of different spatial extent, to match the comoving size of a big 
light cone as a function of redshift. That is, for a given light-cone size, only 
relatively small boxes are needed at low redshifts, and bigger ones at higher 
redshift (see White & Hu 2000). 

Second, the mass in each of these boxes is projected along the
line-of-sight, yielding a surface mass density at the appropriate comoving
distance $w_i = (i - 1/2)L$. Each of these surface mass densities can now be
considered a lens plane, and the propagation of light can be followed from
one lens plane to the next; the corresponding theory was worked out in
detail by Blandford & Narayan (1986; see also Chap. 9 of SEF), but was
applied as early as 1970 by Refsdal (1970) for a cosmological model
consisting of point masses only (see also Schneider & Weiss 1988a, b).
Important to note is that the surface mass density $\Sigma$ in each lens
plane is the projection of $\Delta\rho = \rho - \bar\rho$ of a box, so that
for each lens plane, $\langle \Sigma \rangle = 0$. As has been shown in
Seitz et al. (1994), this multiple lens-plane approach presents a
well-defined discretization of the full 3-dimensional propagation
equations. Light bundles are deflected and distorted in each lens plane and
thus represented as piecewise straight rays. The resulting Jacobi matrix
$\mathcal{A}$ is then obtained as a sum of products of the tidal matrices in
the individual lens planes, yielding a discretized version of the form (88)
for $\mathcal{A}$. The result of such simulations is then the matrix
$\mathcal{A}(\boldsymbol{\theta})$ on a predefined angular grid, as well as
the positions $\boldsymbol{\beta}(\boldsymbol{\theta})$ in the source plane.
The latter will not be needed here, but have been used in studies of
multiple images caused by the LSS (see Wambsganss et al. 1998).
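
The "sum of products of tidal matrices" can be made concrete with the standard multiple lens-plane recursion $\mathcal{A}_j = \mathcal{I} - \sum_{i<j} \beta_{ij}\, \mathcal{U}_i \mathcal{A}_i$. The sketch below follows a single ray and assumes that the tidal matrices $\mathcal{U}_i$ are already evaluated at the ray position in each plane and scaled to the source plane, and that the universe is flat so that distance ratios can be written in terms of comoving distances; these conventions and all names are assumptions of this example.

```python
import numpy as np

def tidal_matrix(kappa, gamma1, gamma2):
    """Dimensionless tidal matrix of one lens plane (scaled to the source plane)."""
    return np.array([[kappa + gamma1, gamma2],
                     [gamma2, kappa - gamma1]])

def beta_factor(w_i, w_j, w_s):
    """Distance ratio D_ij D_s / (D_j D_is), written for a flat universe (assumption)."""
    return (w_j - w_i) * w_s / (w_j * (w_s - w_i))

def jacobi_matrix(w_planes, tidal, w_s):
    """Jacobi matrix at the source plane for one ray through N lens planes.

    w_planes: increasing comoving distances of the planes (all smaller than
    w_s); tidal: list of 2x2 tidal matrices at the ray position in each plane.
    """
    w_all = list(w_planes) + [w_s]
    a_list = []
    for j, w_j in enumerate(w_all):
        a_j = np.eye(2)
        for i in range(j):
            a_j = a_j - beta_factor(w_all[i], w_j, w_s) * tidal[i] @ a_list[i]
        a_list.append(a_j)
    return a_list[-1]
```

For a single plane this reduces to $\mathcal{A} = \mathcal{I} - \mathcal{U}_1$; with two planes the cross term $\beta_{12}\, \mathcal{U}_2 \mathcal{U}_1$ appears, which is in general not symmetric.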

One needs special care in applying the foregoing prescription, in particular
in the smoothing process used to obtain a mass distribution from the
discrete particles; Jain et al. (2000) contains a detailed discussion of
these issues (see footnote 10). The finite spatial resolution in the
simulations translates into a redshift-dependent angular resolution, which
degrades for the low-redshift lens planes; on the other hand, those have a
small impact on the light propagation due to the large value of
$\Sigma_{\rm cr}$ for them [see eq. (10) of IN]. The discreteness of
particles gives rise to a shot-noise term in the mass distribution, yielding
increased power on small angular scales.

Results from ray-tracing simulations. We shall summarize here some of the
results from ray-tracing simulations:

• Whereas the Jacobi matrix in this multi-deflection situation is no longer
symmetric, the contribution from the asymmetry is very small. The power
spectrum of the asymmetric part of $\mathcal{A}$ is at least three orders of
magnitude smaller than the power spectrum $P_\kappa$, for sources at
$z_s = 1$ (Jain et al. 2000). This result is in accord with analytical
expectations (e.g., Bernardeau et al. 1997; Schneider et al. 1998a), i.e.,
that terms quadratic in the Newtonian potential are considerably smaller
than first-order terms, and supports the validity of the Born approximation.
Furthermore, this result suggests that a simpler method for predicting
cosmic shear distributions from numerical simulations may be legitimate,
namely to project the mass distribution of all lens planes along the grid of
angular positions, with the respective weighting factors, according to (92),
i.e., employing the Born approximation. Of course this simplified method is
computationally much faster than the full ray-tracing.

• The power spectra obtained reproduce the ones derived using (99), over 
the range of wavevectors which are only mildly affected by resolution and 
discreteness effects. This provides an additional check on the accuracy of 
the fitting formulae for the non-linear power spectrum. 

• The simulation results give the full two-dimensional shear map, and thus
can be used to study properties other than the second-order ones, e.g.,
higher-order statistics, or the occurrence of circular shear patterns
indicating the presence of strong mass concentrations. An example of such
maps is shown in Fig. 26. These shear maps can be used to simulate real
surveys, e.g., including the holes in the data field resulting from masking
or complicated survey geometries, and thus to determine the accuracy with
which the power spectra can be determined from such surveys. Note that in
order to quantify the error (or covariance matrix) of any second-order
statistics, one needs to know the fourth-order statistics, which in general
cannot be obtained analytically when outside the linear (Gaussian) regime.
Simulations are also used to obtain good survey strategies.

10 For other recent ray-tracing simulations related to cosmic shear, see e.g.
Barber et al. (2000); Hamana & Mellier (2001); Premadi et al. (2001); Taruya
et al. (2002); Fluke et al. (2002); Barber (2002); Vale & White (2003).
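
The simpler Born-approximation method mentioned in the first result above amounts to a weighted sum over the lens planes. The following sketch assumes a flat universe, a single source distance, density-contrast maps already interpolated onto a common angular grid, and the usual lensing weight $\frac{3}{2}\Omega_{\rm m}(H_0/c)^2\, w\,(w_s - w)/(w_s\, a)$ as the discretized form of (92); the default $\Omega_{\rm m}$ and all names are illustrative assumptions.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # Hubble radius c/H0 in h^-1 Mpc

def born_convergence(delta_planes, w_planes, a_planes, w_s, box, omega_m=0.3):
    """Convergence map from stacked density-contrast planes (Born approximation).

    delta_planes: array (n_planes, ny, nx) holding, for each box, the
    line-of-sight mean of the density contrast on a common angular grid.
    w_planes, a_planes: comoving distance (h^-1 Mpc) and scale factor of each
    plane; box: comoving thickness of each box. Flat universe assumed.
    """
    delta = np.asarray(delta_planes, dtype=float)
    w = np.asarray(w_planes, dtype=float)
    a = np.asarray(a_planes, dtype=float)
    weight = 1.5 * omega_m / C_OVER_H0**2 * box * w * (w_s - w) / (w_s * a)
    weight = np.where(w < w_s, weight, 0.0)  # planes behind the source do not contribute
    return np.tensordot(weight, delta, axes=(0, 0))
```

No ray positions are updated between planes, which is why this approach is much cheaper than the full ray-tracing; it cannot, by construction, reproduce the small asymmetric part of the Jacobi matrix.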

Higher-order correction terms. Up to now we have considered the
lowest-order approximation of the Jacobi matrix (88) and have argued that
this provides a sufficiently accurate description. Higher-order terms in
$\Phi$ were neglected since we argued that, because the Newtonian potential
is very small, these should play no important role. However, this argument
is not fully correct since, whereas the potential certainly is small, its
derivatives are not necessarily so. Of course, proper ray-tracing
simulations take these higher-order terms automatically into account.

We can consider the terms quadratic in $\Phi$ when expanding (88) to higher
order. There are two such terms, one containing the product of second-order
derivatives of $\Phi$, the other a product of first derivatives of $\Phi$
and its third derivatives. The former is due to lens-lens coupling: the
shear and surface mass densities from different redshifts (or lens planes,
in the discretized approximation) do not simply add, but multiple lens-plane
theory shows that the tidal matrices from different lens planes get
multiplied. The latter term comes from dropping the Born approximation and
couples the deflection of a light ray (first derivative of $\Phi$) with the
change of the tidal matrix with position (third derivatives of $\Phi$).
These terms are explicitly given in the appendix of Schneider et al.
(1998a), in Bernardeau et al. (1997) and in Cooray & Hu (2002), and are
found to be indeed small, providing corrections of at most a few percent.
Furthermore, Hamana (2001) has shown that the magnification bias caused by
the foreground matter inhomogeneities on the selection of background
galaxies has no practical effect on second-order cosmic shear statistics.

Another effect that influences the power spectrum $P_\kappa$ is the
difference between shear and reduced shear, the latter being the observable.
Since the correlation function of the reduced shear is the correlation
function of the shear plus a term containing a product of two shears and one
surface mass density, this correction depends linearly on the third-order
statistical properties of the projected mass $\kappa$. This correction also
turns out to be very small; moreover, it does not give rise to any B-mode
contribution (Schneider et al. 2002a).
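
The structure of this correction follows from a first-order expansion of the reduced shear. The lines below are a sketch valid for $|\kappa| \ll 1$, with indices $i$, $j$ labelling the two positions being correlated; terms of fourth order in the lensing quantities are dropped.

```latex
\begin{align}
g &= \frac{\gamma}{1-\kappa} \approx \gamma\,(1+\kappa) , \\
\langle g_i\, g_j^* \rangle
  &\approx \langle \gamma_i \gamma_j^* \rangle
   + \langle \gamma_i \gamma_j^* \,(\kappa_i + \kappa_j) \rangle .
\end{align}
```

The second term is a three-point quantity of the lensing field, which is why the correction is linear in the third-order statistics of $\kappa$.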

7 Large-scale structure lensing: results 

After the theory of cosmic shear was considered in some detail in the
previous section, we shall summarize here the observational results that
have been obtained so far. In fact, as we will see, progress has been
incredibly fast over the past ~four years, with the first detections
reported in 2000, and much larger surveys being available by now, with even
larger ones ongoing or planned. Already by now, cosmic shear is one of the
pillars on which our cosmological model rests.

The predictions discussed in the previous section have shown that the 
rms value of cosmic shear is of the order of ~ 2% on angular scales of ~ 
1', and considerably smaller on larger scales. These small values make the 
measurements of cosmic shear particularly challenging, as the observational 
and instrumental effects described in Sect. 3 are expected to be larger than the 
cosmic shear signal, and thus have to be understood and removed with great 
precision. For example, the PSF anisotropy of nearly all wide-field cameras 
is considerably larger than a few percent and thus needs to be corrected for. 
But, as also discussed in Sect. 3, methods have been developed and thoroughly 
tested which are able to do so. 

7.1 Early detections of cosmic shear 

Whereas the theory of cosmic shear was worked out in the early 1990s
(Blandford et al. 1991; Miralda-Escude 1991; Kaiser 1992), it took until the
year 2000 before this effect was first discovered (see footnote 11). The
reason for this evolution must be seen in a combination of instrumental
developments, i.e. the wide-field CCD mosaic cameras, and the image analysis
software, like IMCAT (the software package encoding the KSB method discussed
in Sect. 3.5), with which shapes of galaxies can be accurately corrected for
PSF effects. Then in March 2000, four groups independently announced their
first discoveries of cosmic shear (Bacon et al. 2000; Kaiser et al. 2000;
van Waerbeke et al. 2000; Wittman et al. 2000). In these surveys, of the
order of $10^5$ galaxy images have been analyzed, covering about
1 deg$^2$. Later that year, Maoli et al. (2001) reported a significant
cosmic shear measurement from 50 widely separated FORS1@VLT images, each of
size $\sim 6'.5 \times 6'.5$, which also agreed with the earlier results.
The fact that the results from four independent teams agreed within the
respective error bars immediately lent credit to this new window of
observational cosmology. This is also due to the fact that 4 different
telescopes, 5 different cameras (the UH8K and CFH12K at CFHT, the
$8' \times 16'$ imager on WHT, the BTC at the 4m-CTIO telescope and FORS1 at
the VLT), independent data reduction tools and at least two different image
analysis methods have been used. These early results are displayed in
Fig. 39, where the (equivalent) shear dispersion is plotted as a function of
effective circular aperture radius, together with the predictions for
several cosmological models. It is immediately clear that a
high-normalization Einstein-de Sitter model can already be excluded from
these early results, but the other three models displayed are equally valid
approximations to the data.

11 An early heroic attempt by Mould et al. (1994) to detect cosmic shear on a
single $\sim 9' \times 9'$ field only yielded an upper limit, and the
putative detection of a shear signal by Schneider et al. (1998b; see also
Fort et al. 1996) in three $2' \times 2'$ fields is, due to the very small
sky area, of no cosmological relevance.




Fig. 39. Shear dispersion as a function of equivalent circular aperture
radius as obtained from the first five measurements of cosmic shear (MvWM+:
Maoli et al. 2001; vWME+: van Waerbeke et al. 2000; KWL: Kaiser, Wilson &
Luppino 2000; BRE: Bacon, Refregier & Ellis 2000; WTK: Wittman et al. 2000).
The data points from each team are not statistically independent, due to the
fairly strong covariance of the shear dispersion on different angular
scales, but points from different teams are independent (see text). The
error bars contain the noise from the intrinsic ellipticity dispersion and,
for some of the groups, also an estimate of cosmic variance. The four curves
are predictions from four cosmological models; the uppermost one corresponds
to an Einstein-de Sitter Universe with normalization $\sigma_8 = 1$, and can
clearly be excluded by the data. The other three models are cluster
normalized - see Sect. 4.4 of IN - and all provide equally good fits to
these early data (courtesy: Y. Mellier)



Maoli et al. (2001) considered the constraints one obtains by combining the
results from these five surveys, in terms of the normalization parameter
$\sigma_8$ of the power spectrum. The confidence contours in the
$\Omega_{\rm m}$-$\sigma_8$ plane are shown in Fig. 40. There is clearly a
degeneracy between these two parameters from the data sets considered,
roughly tracing $\sigma_8 \approx 0.59\,\Omega_{\rm m}^{-0.47}$; although
the best-fitting model is defined by $\Omega_{\rm m} = 0.26$,
$\sigma_8 = 1.1$, it cannot be significantly distinguished from, e.g., an
$\Omega_{\rm m} = 1$, $\sigma_8 = 0.62$ model, since the error bars
displayed in Fig. 39 are too large and the range of angular scales over
which the shear was measured is too small. In Fig. 40, the solid curve
displays the normalization as obtained from the abundance of massive
clusters, which is seen to follow pretty much the valley of degeneracy from
the cosmic shear analysis. This fact should not come as a surprise, since
the cluster abundance probes the power spectrum on a comoving scale of about
$8\,h^{-1}$ Mpc, which is comparable to the median scale probed by the
cosmic shear measurements. However, the predictions of the cluster abundance
rely on the assumption that the initial density field was Gaussian, whereas
the cosmic shear prediction is independent of this assumption, which
therefore can be tested by comparing the results from both methods.
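
As a quick arithmetic check of the degeneracy line, reading the relation as $\sigma_8 \approx 0.59\,\Omega_{\rm m}^{-0.47}$, one can evaluate it at the two models quoted above. The function name is a hypothetical helper introduced for this illustration.

```python
def sigma8_on_degeneracy(omega_m, amplitude=0.59, slope=-0.47):
    """sigma_8 along the approximate degeneracy line sigma_8 ~ 0.59 Omega_m^-0.47."""
    return amplitude * omega_m**slope
```

For $\Omega_{\rm m} = 0.26$ this gives $\sigma_8 \approx 1.11$, close to the best-fitting value of 1.1, and for $\Omega_{\rm m} = 1$ it gives 0.59, close to the 0.62 of the alternative model; both models therefore lie near the same valley of degeneracy.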

Fig. 40. Constraints on $\Omega_{\rm m}$ and $\sigma_8$ from the five surveys
shown in Fig. 39; shown are 1-, 2- and 3-$\sigma$ confidence regions. The
cross denotes the best-fitting model, but as can be seen, these two
parameters are highly degenerate with the data used. The solid curve
displays the constraint from cluster normalization (from Maoli et al. 2001)

7.2 Integrity of the results

As mentioned before, the cosmic shear effects are smaller than many ob- 
servational effects (like an anisotropic PSF) that could mimic a shear; it is 
therefore necessary to exclude as much as possible such systematics from the 
data. The early results described above were therefore accompanied by quite 
a large number of tests; they should be applied to all cosmic shear surveys 
as a sanity check. A few of those shall be mentioned here. 



Stellar ellipticity fits. The ellipticity of stellar objects should be well
fitted by a low-order function, so that one is able to predict the PSF
anisotropy at galaxy locations. After subtracting this low-order fit from
the measured stellar ellipticities, there should be no coherent spatial
structure remaining, and the dispersion of the corrected ellipticities
should be considerably smaller than that of the original ones, essentially
compatible with measurement noise.
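
A minimal sketch of such a fit is given below, assuming a two-dimensional polynomial in chip coordinates as the low-order function and complex ellipticities $e = e_1 + {\rm i} e_2$; the polynomial order and the function names are choices made for this illustration, not prescriptions from the surveys discussed.

```python
import numpy as np

def design_matrix(x, y, order=2):
    """Columns x^p y^q with p + q <= order."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = [x**p * y**q for p in range(order + 1) for q in range(order + 1 - p)]
    return np.column_stack(cols)

def fit_psf_anisotropy(x_star, y_star, e_star, order=2):
    """Least-squares polynomial fit to complex stellar ellipticities."""
    e_star = np.asarray(e_star, dtype=complex)
    rhs = np.column_stack([e_star.real, e_star.imag])  # fit both components at once
    coeffs, *_ = np.linalg.lstsq(design_matrix(x_star, y_star, order), rhs, rcond=None)
    return coeffs[:, 0] + 1j * coeffs[:, 1]

def predict_psf_anisotropy(coeffs, x, y, order=2):
    """Evaluate the fitted PSF anisotropy at arbitrary (e.g. galaxy) positions."""
    return design_matrix(x, y, order) @ coeffs
```

The residuals `e_star - predict_psf_anisotropy(coeffs, x_star, y_star)` are then the quantity to inspect: they should show no coherent pattern and a dispersion compatible with measurement noise.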

Correlation of PSF anisotropy with corrected galaxy ellipticities.

After correcting for the anisotropy of the PSF, there should remain no
correlation between the corrected galaxy ellipticities and the ellipticity of
the PSF. This correlation can be measured by considering $\langle e\,e^*
\rangle$, where $e$ is the corrected galaxy ellipticity (31), and $e^*$ the
uncorrected stellar ellipticity (i.e., the PSF anisotropy). Bacon et al. (2000)
found that for fairly low signal-to-noise galaxy images, this correlation was
significantly different from zero, but for galaxies with high S/N (only those
entered their cosmic shear analysis), no significant correlation remained. The
same was found in van Waerbeke et al. (2000), except that the average
$\langle e_1 \rangle$ was slightly negative, but independent of $e^*$. The
level of $\langle e_1 \rangle$ was much smaller than the estimated cosmic
shear, and does not affect the latter by more than 10%.
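
The test just described can be sketched in a few lines. The following is a minimal illustration, not the procedure of any of the cited papers: ellipticities are stored as complex numbers $e_1 + {\rm i} e_2$, the PSF ellipticity is assumed to have been interpolated to each galaxy position already, and the function names, the mock catalog and all numbers in it are hypothetical.

```python
import math
import random


def psf_galaxy_correlation(gal_e, star_e):
    """Return the mean of Re(e * conj(e_star)) and its standard error.

    gal_e  : corrected galaxy ellipticities (complex, e1 + i*e2)
    star_e : PSF ellipticities at the galaxy positions (complex)
    """
    products = [(g * s.conjugate()).real for g, s in zip(gal_e, star_e)]
    n = len(products)
    mean = sum(products) / n
    var = sum((p - mean) ** 2 for p in products) / (n - 1)
    return mean, math.sqrt(var / n)


def mock_catalog(n, leakage, seed):
    """Toy data (made-up numbers): PSF anisotropy plus galaxy shape noise.

    'leakage' is the fraction of the PSF ellipticity left in the galaxies.
    """
    rng = random.Random(seed)
    star_e = [complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for _ in range(n)]
    gal_e = [complex(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) + leakage * s
             for s in star_e]
    return gal_e, star_e


clean_mean, clean_err = psf_galaxy_correlation(*mock_catalog(20000, 0.0, seed=1))
leaky_mean, leaky_err = psf_galaxy_correlation(*mock_catalog(20000, 0.5, seed=2))
print(f"well corrected : {clean_mean / clean_err:+.1f} sigma")
print(f"residual PSF   : {leaky_mean / leaky_err:+.1f} sigma")
```

A correlation compatible with zero within its error is what a successful anisotropy correction should give; a significant positive value indicates residual PSF anisotropy in the galaxy shapes.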

Spatial dependence of mean galaxy ellipticity. When a cosmic shear
survey consists of many uncorrelated fields, the mean galaxy ellipticity at a
given position on the CCD chips should be zero, due to the assumed statistical 
isotropy of the shear field. If, on the other hand, the shear averaged over many 
fields shows a dependence on the chip position, most likely optical distortions 
and/or PSF effects have not been properly accounted for. 

Parity invariance. The two-point correlation function $\xi_\times(\theta) =
\langle \gamma_{\rm t} \gamma_\times \rangle(\theta)$ is expected to vanish for
a density distribution that is parity symmetric. More generally, every
astrophysical cause for a 'shear' signal (such as intrinsic galaxy alignments,
or higher-order lensing effects) is expected to be invariant under parity
transformation. A significant cross-correlation $\xi_\times$ would therefore
indicate systematic effects in the observations and/or data analysis.
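
The behaviour of $\xi_\times$ under a parity transformation can be checked with a brute-force pair estimator. The sketch below is only an illustration under stated assumptions: positions and shears are complex numbers, the tangential and cross components are defined relative to the line connecting each pair with the sign convention $\gamma_{\rm t} + {\rm i}\gamma_\times = -\gamma\,{\rm e}^{-2{\rm i}\varphi}$, and the function names and the random catalog are hypothetical.

```python
import random


def xi_plus_cross(pos, shear, r_min, r_max):
    """Estimate xi_+ and xi_x from all pairs with r_min <= separation < r_max.

    pos   : complex positions x + i*y
    shear : complex shears gamma_1 + i*gamma_2
    """
    sum_plus = sum_cross = 0.0
    n_pairs = 0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = pos[j] - pos[i]
            r = abs(d)
            if not (r_min <= r < r_max):
                continue
            rot = (d.conjugate() / r) ** 2      # exp(-2 i phi) of the pair
            gi = -shear[i] * rot                # gamma_t + i*gamma_x, galaxy i
            gj = -shear[j] * rot                # gamma_t + i*gamma_x, galaxy j
            sum_plus += gi.real * gj.real + gi.imag * gj.imag
            sum_cross += 0.5 * (gi.real * gj.imag + gi.imag * gj.real)
            n_pairs += 1
    return sum_plus / n_pairs, sum_cross / n_pairs


def mirror(values):
    """Parity flip y -> -y: positions and shears are complex-conjugated."""
    return [v.conjugate() for v in values]


rng = random.Random(3)
pos = [complex(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(150)]
shear = [complex(rng.gauss(0, 0.03), rng.gauss(0, 0.03)) for _ in range(150)]

xi_p, xi_x = xi_plus_cross(pos, shear, 1.0, 3.0)
xi_p_m, xi_x_m = xi_plus_cross(mirror(pos), mirror(shear), 1.0, 3.0)
print(f"xi_+ : {xi_p:+.3e}  mirrored {xi_p_m:+.3e}")
print(f"xi_x : {xi_x:+.3e}  mirrored {xi_x_m:+.3e}")
```

The mirrored catalog returns the same $\xi_+$ but the opposite sign of $\xi_\times$, which is why a parity-symmetric signal cannot produce a non-zero $\xi_\times$ on average.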

7.3 Recent cosmic shear surveys 

Relatively soon after the announcement of the first cosmic shear detections, 
additional results were published. These newer surveys can roughly be
classified as follows: deep surveys; shallower but much wider surveys; and
special surveys, such as those obtained with the Hubble Space Telescope. We shall mention
examples of each of these classes here, without providing a complete list. 

Deep surveys. Currently the largest of the deep surveys from which cosmic 
shear results have been published is the VIRMOS-DESCART survey, carried 
out with the CFH12K camera at the CFHT; this camera covers about 45' x 30' 

Fig. 41. The shear dispersion as a function of aperture radius (left) and the
shear correlation function $\xi_+(\theta)$ (right) as measured from the
VIRMOS-DESCART survey (van Waerbeke et al. 2001). The lower panel on the right
shows an enlargement with logarithmic axis of the larger figure. The error bars
were calculated from simulations in which the galaxy images have been
randomized in orientation. The curves show predictions from three different
cosmological models, corresponding to $(\Omega_{\rm m}, \Omega_\Lambda,
\sigma_8) = (0.3, 0, 0.9)$ (open model, short-dashed curves), $(0.3, 0.7, 0.9)$
(low-density flat model, solid curves), and $(1, 0, 0.6)$ (Einstein-de Sitter
Universe, long-dashed curves). In all cases, the shape parameter of the power
spectrum was set to $\Gamma_{\rm spect} = 0.21$. The redshift distribution of
the sources was assumed to follow the law (128), with $\alpha = 2$,
$\beta = 1.5$ and $z_0 = 0.8$, corresponding to a mean redshift of
$\bar z \approx 1.2$



in one exposure. The exposure time of the images, taken in the I-band, is
one hour. The survey covers four fields of $2^\circ \times 2^\circ$ each, of
which roughly $8.5\,{\rm deg}^2$ have been used for a weak lensing analysis up
to now (van Waerbeke et al. 2001, 2002). About 20% of the area is masked out,
to account for diffraction spikes, image defects, bright and large foreground
objects etc. The number density of galaxy images used for the cosmic shear
analysis is about $17\,{\rm arcmin}^{-2}$. A small part of this survey was used
for the early cosmic shear detection (van Waerbeke et al. 2000). Compared to
the earlier results, the error bars on the shear measurements are greatly
reduced, owing to the much better statistics. We show in Fig. 41 the shear
dispersion and the correlation function as measured from this survey.
Furthermore, this survey yielded the first detection of a significant
$\langle M_{\rm ap}^2 \rangle$-signal; we shall come back to this later.
In order to compare the measured shear signal with cosmological predictions,
one needs to assume a redshift distribution for the galaxies; a frequently used
parameterization for this is



\[
  p(z) = N \left(\frac{z}{z_0}\right)^{\alpha}
         \exp\left[-\left(\frac{z}{z_0}\right)^{\beta}\right] , \qquad (128)
\]
where $\alpha$ and $\beta$ determine the shape of the redshift distribution,
$z_0$ the characteristic redshift, and $N$ is a normalization factor, chosen
such that $\int {\rm d}z\; p(z) = 1$.
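
For a distribution of the form $p(z) \propto (z/z_0)^\alpha \exp[-(z/z_0)^\beta]$, the normalization and the mean follow from Gamma functions, $N = \beta / [z_0\, \Gamma((1+\alpha)/\beta)]$ and $\bar z = z_0\, \Gamma((2+\alpha)/\beta) / \Gamma((1+\alpha)/\beta)$. The short sketch below evaluates these for the parameter values quoted in the caption of Fig. 41 and cross-checks them by direct integration; the function names and the integration grid are illustrative choices only.

```python
import math


def p_of_z(z, alpha=2.0, beta=1.5, z0=0.8):
    """Normalized redshift distribution N (z/z0)^alpha exp[-(z/z0)^beta]."""
    norm = beta / (z0 * math.gamma((1.0 + alpha) / beta))
    return norm * (z / z0) ** alpha * math.exp(-((z / z0) ** beta))


def mean_redshift(alpha=2.0, beta=1.5, z0=0.8):
    """Analytic mean redshift of the distribution."""
    return (z0 * math.gamma((2.0 + alpha) / beta)
            / math.gamma((1.0 + alpha) / beta))


def integrate(f, a, b, n=20000):
    """Plain trapezoidal rule, used only as a numerical cross-check."""
    h = (b - a) / n
    inner = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))


total = integrate(p_of_z, 0.0, 10.0)
z_mean = integrate(lambda z: z * p_of_z(z), 0.0, 10.0)
print(f"integral of p(z): {total:.4f}")
print(f"mean redshift   : {z_mean:.3f} (analytic {mean_redshift():.3f})")
```

With $\alpha = 2$, $\beta = 1.5$ and $z_0 = 0.8$ the mean comes out close to 1.2, in agreement with the value quoted for the VIRMOS-DESCART predictions.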

Another example of a deep survey is the Suprime-Cam survey (Hamana
et al. 2003), a $2.1\,{\rm deg}^2$ survey taken with the wide-field camera
Suprime-Cam (with a $34' \times 27'$ field-of-view) at the 8.2-m Subaru
telescope. With an exposure time of 30 min, the data are considerably deeper
than those of the VIRMOS-DESCART survey, due to the much larger aperture of the
telescope. After cuts in the object catalog, the resulting number density of
objects used for the weak lensing analysis is $\sim 30\,{\rm arcmin}^{-2}$.
Fig. 42 shows how small the PSF anisotropy is, and that the correction with a
fifth-order polynomial over the whole field-of-view in fact reduces the
remaining stellar ellipticities considerably. This survey has detected a
significant cosmic shear signal, as measured by the shear correlation functions
and the aperture mass dispersion, over angular scales $2' \lesssim \theta
\lesssim 40'$. The shear signal increases as fainter galaxies are used in the
analysis, as expected, since fainter galaxies are expected to be at larger mean
redshift and thus show a stronger shear signal.

Fig. 42. Stellar ellipticities before and after correction for PSF
anisotropies in the Suprime-Cam survey. Numbers give mean and dispersion of
stellar ellipticities $|\chi|$ (from Hamana et al. 2003)



Bacon et al. (2003) combine images taken at the Keck II telescope and the 
WHT. For the former, 173 fields were used, each having a f.o.v. of 2' x 8'; and 
the data from WHT were obtained from 20 different fields, covering about 
1 deg 2 in total. The large number of fields minimizes the sample variance of 
this particular survey, and the two instruments used allowed a cross-check of 
instrumental systematics. 

Very wide surveys. Within a given observing time, instead of mapping a 
sky region to fairly deep magnitudes, one can also map larger regions with 
smaller exposure time; since most of the surveys have been carried out with 
goals in addition to cosmic shear, the survey strategy will depend on these 
other considerations. We shall mention two very wide surveys here. 

Hoekstra et al. (2002a; also Hoekstra et al. 2002b) used the Red Cluster
Sequence (RCS) survey, a survey designed to obtain a large sample of galaxy
clusters using color selection techniques (Gladders & Yee 2000). The cosmic
shear analysis is based on $53\,{\rm deg}^2$ of $R_C$-band data, spread over 13
patches on the sky and observed with two different instruments, the CFH12K at
the CFHT for Northern fields, and the Mosaic II camera at the CTIO 4m telescope
in the South. The integration times are 900 s and 1200 s, respectively. The
shear dispersions measured with the two instruments are in satisfactory
agreement, so that the two data sets can be safely combined. Owing to the
shallower limiting magnitude, the detected shear is smaller than in the deeper
surveys mentioned above: on a scale of 2.5 arcmin, the shear dispersion is
$\langle|\gamma|^2\rangle \approx 4 \times 10^{-5}$ in the RCS survey, compared
to $\approx 2 \times 10^{-4}$ in the deeper VIRMOS-DESCART survey (see Fig. 41),
in accordance with expectations.

Jarvis et al. (2003) presented a cosmic shear survey of $75\,{\rm deg}^2$,
taken with the BTC camera and the Mosaic II camera on the CTIO 4m telescope,
with about half the data taken with each instrument. The survey covers 12
fields, each with sidelength of $\sim 2.5^\circ$. For each pointing, three
exposures of 5 min were taken, making the depth of this survey comparable to
the RCS. A total of $\sim 2 \times 10^6$ galaxies with $R < 23$ were used for
the shear analysis. Since this survey has some peculiar properties which are
very educational, it will be discussed in somewhat more detail. The first point
to notice is the large pixel size of the BTC, $0.43''$ per pixel; for
comparison, the CFH12K has $\sim 0.20''$ per pixel. With a median seeing of
$1.05''$, the PSF is slightly undersampled with the BTC. Second, the PSF
anisotropy on the BTC is very large, as shown in Fig. 43: a large fraction of
the exposures has stellar images with ellipticities higher than 10%. Obviously,
this renders the image analysis and the correction for PSF effects challenging.
As shown on the right-hand part of Fig. 43, this challenge is indeed met. This
fact is very nicely illustrated in Fig. 44, where the corrected stellar
ellipticities are shown as a function of the PSF anisotropy; in essence, the
correction reduces the PSF anisotropy by nearly a factor of 300!

The third point to notice is that the image analysis for this survey has 
not been carried out with IMCAT (as for most of the other surveys), but
by a different image analysis method described in Bernstein & Jarvis (2002). 
In this respect, this survey is independent of all the others described in this 
section; it is important to have more than one image analysis tool to check 
potential systematics of either one. 

One of the amazing results from the CTIO cosmic shear survey is that the
shear dispersion can be measured with about a $3\sigma$ significance on each of
the 12 fields. Hence, this provides a shear dispersion measurement on scales
larger than 1 degree (the radius of a circle whose area equals the mean area of
the 12 fields, $\sim 6.2\,{\rm deg}^2$); the shear dispersion on these scales is
$\langle|\gamma|^2\rangle = 0.0012 \pm 0.0003$.
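
As a quick check of the quoted scale (a worked estimate added here for orientation, not a number taken from Jarvis et al.), the radius of a circle with the mean field area $A \approx 6.2\,{\rm deg}^2$ is

```latex
\[
  \theta_{\rm eff} = \sqrt{\frac{A}{\pi}}
                   = \sqrt{\frac{6.2\,{\rm deg}^2}{\pi}}
                   \approx 1.4^\circ ,
\]
```

which is indeed somewhat larger than one degree.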

Special surveys. There are a number of cosmic shear surveys which cover
a much smaller total area than the ones mentioned above, and are thus not
competitive in terms of statistical accuracy, but which have some special
properties which give them an important complementary role. Examples are the
surveys carried out with the Hubble Space Telescope. Since for them the

Fig. 43. On the left-hand side, the raw ellipticities of stars are shown for the four
CCDs of the BTC instrument; for reference, a 1% ellipticity is indicated. After
correcting for the PSF anisotropy, the remaining stellar ellipticities (shown on the
right) are of order 1-2%, and essentially uncorrelated with position on the chip,
i.e., they are compatible with measurement noise (from Jarvis et al. 2003)

Fig. 44. These two plots show the two components of the stellar ellipticities as
measured on the data (x-axis) and after correction, from the Jarvis et al. (2003)
survey. The slope of the straight line is about 1/300, meaning that the strong
PSF anisotropy can be corrected for up to this very small residual. The final PSF
anisotropy is well below $5 \times 10^{-4}$. This figure, together with Fig. 43,
demonstrates how well the procedures for PSF corrections work (from Jarvis et al. 2003)

PSF is much smaller than for ground-based observations, PSF corrections
in measuring galaxy ellipticities are expected to be correspondingly smaller.
The drawback of HST observations is that its cameras, at least before the
installation of the ACS, have a small field-of-view, less than
$1\,{\rm arcmin}^2$ for the STIS CCD, and about $5\,{\rm arcmin}^2$ for WFPC2.
This implies that the total area covered by HST surveys is smaller than that
achievable from the ground, and that the number of stars per field is very
small, so that PSF measurements are typically not possible on those frames
which are used for a cosmic shear analysis. Hence, the PSF needs to be measured
on different frames, e.g., taken on star clusters, and one needs to assume
(this assumption can be tested, of course) that the PSF is fairly stable in
time. In fact, this is not really true, as the telescope moves in and out of
the Earth's shadow every orbit, thereby changing its temperature and thus its
length (an effect called breathing). A further potential problem of HST
observations is that the WFPC2 has a pixel scale of $0.1''$ and thus
substantially undersamples the PSF; this is likely to be a serious problem for
very faint objects whose size is not much larger than the PSF size.

Cosmic shear surveys from two instruments onboard HST have been reported
in the literature so far. One of the surveys uses archival data from the
Medium Deep Survey, a mostly parallel survey carried out with the WFPC2.
Refregier et al. (2002) used 271 WFPC2 pointings observed in the I-band,
selected such that each of them is separated from the others by at least
10' to have statistically independent fields. They detected a shear dispersion
on the scale of the WFC chips (which is equivalent to a scale
$\theta \sim 0.72'$) of $\langle|\gamma|^2\rangle \approx 3.5 \times 10^{-4}$,
which is a $3.8\sigma$ detection. The measurement accuracy is lower than that,
owing to cosmic variance and uncertainties in the redshift distribution of the
sources. Hämmerle et al. (2002) used archival parallel data taken with STIS;
from the 121 fields which are deep enough, have multiple exposures, and are at
sufficiently high galactic latitude, they obtained a shear dispersion of
$\langle|\gamma|^2\rangle \approx 15 \times 10^{-4}$ on an effective scale of
$\sim 30''$, a mere $1.5\sigma$ detection. This low significance is due to the
small total area covered by this survey. On the other hand, since the pixel
scale of STIS is half of that of WFPC2, the undersampling problem is much less
severe in this case. A larger set of STIS parallel observations was analyzed
with respect to cosmic shear by Rhodes et al. (2004) and Miralles et al. (2003).
Whereas Rhodes et al. obtained a significant ($\sim 5\sigma$) detection on an
angular scale of $\sim 30''$, Miralles et al. concluded that the degradation of
the STIS CCD in orbit regarding the charge transfer efficiency prevents a solid
measurement of weak lensing. The origin of the discrepancy between these two
works, which are based to a large degree on the same data set, is unclear at
present. Personally I consider this discrepancy as a warning sign that weak
lensing measurements based on small fields-of-view, and correspondingly too few
stars to control the PSF on the science exposures, need to be regarded with
extreme caution.

The new ACS onboard HST offers better prospects for cosmic shear
measurements, since it has a substantially larger field-of-view. A first result was
derived by Schrabback (2004), again based on parallel data. He found that 
the PSF is not stable in time, but that the anisotropy pattern changes among 
only a few characteristic patterns. He used those as templates, and the (typ- 
ically a dozen) stars in the science frames to select a linear combination of 
these templates for the PSF correction of individual frames, thereby obtaining 
a solid detection of cosmic shear from the early ACS data. 

A further survey that should be mentioned here is the one conducted
on COMBO-17 fields (Brown et al. 2003). COMBO-17 is a one square degree
survey, split over four fields, taken with the WFI at the ESO/MPG 2.2m
telescope on La Silla, in 5 broad-band and 12 medium-band filters. In essence,
therefore, this multi-band survey produces low-resolution spectra of the
objects and thus permits the determination of very accurate photometric
redshifts of the galaxies taken for the shear analysis. Therefore, for the
analysis of Brown et al., the redshift distribution of the galaxies is assumed
to be very well known and not a source of uncertainty in translating the cosmic
shear measurement into a constraint on cosmological parameters. We shall return
to this aspect in Sect. 7.6. The data set was reanalyzed by Heymans et al.
(2004), where special care was taken to identify and remove the signal coming
from intrinsic alignment of galaxy shapes.

7.4 Detection of B-modes 

The recent cosmic shear surveys have measured the aperture mass dispersion
$\langle M_{\rm ap}^2(\theta)\rangle$, as well as its counterpart
$\langle M_\perp^2(\theta)\rangle$ for the B-modes (see Sect. 6.5). These
aperture measures are obtained in terms of the directly measured shear
correlation functions, using the relations (125). As an example, we show in
Fig. 45 the aperture measures as obtained from the Red Cluster Sequence
survey (Hoekstra et al. 2002a). A significant measurement of
$\langle M_{\rm ap}^2(\theta)\rangle$ is obtained over quite a range of angular
scales, with a peak around a few arcminutes, as predicted from CDM power
spectra (see Fig. 35). In addition to that, however, a significant detection of
$\langle M_\perp^2(\theta)\rangle$ signifies the presence of B-modes. As
discussed in Sect. 6.5, those cannot be due to cosmic shear. The only plausible
explanation for them, apart from systematics in the observations and data
analysis, is an intrinsic alignment of galaxies. If this is the cause of the
B-modes, then one would expect that the relative contribution of the B-mode
signal decreases as higher-redshift galaxies are used for the shear
measurement. In fact, this expectation is satisfied, as shown in Fig. 45, where
the galaxy sample is split into a bright and faint part, and the relative
amplitude of the B-mode signal is smaller for the fainter (and thus presumably
more distant) sample.

Similar detections of a B-mode signal have been obtained by the other 
surveys. For example, van Waerbeke et al. (2001) reported a significant B- 
mode signal on angular scales of a few arcminutes. In the reanalysis of the 

Fig. 45. The aperture mass dispersion $\langle M_{\rm ap}^2(\theta)\rangle$
(top panels) and the cross aperture dispersion
$\langle M_\perp^2(\theta)\rangle$ (bottom panels) from the RCS survey
(Hoekstra et al. 2002a). In the left panels, all galaxies with apparent
magnitude $20 < R_C < 24$ are used; the middle and right panels show the same
statistics for the brighter and fainter subsamples of background galaxies,
respectively. Error bars in the former are larger, owing to the smaller number
of bright galaxies



VIRMOS-DESCART data, van Waerbeke et al. (2002) reported that the B-mode
on these scales was caused by the polynomial PSF anisotropy fit: the
third-order function (fitted for each chip individually) has its largest
amplitude near the boundary of the chips and is least well constrained there,
unless one finds stars close to these edges. If a second-order polynomial fit
is used, the B-modes on a few arcminute scales disappear. Van Waerbeke et al.
(2002) calculated the aperture statistics from the uncorrected stellar
ellipticities in their survey and found that the 'E- and B-modes' of the PSF
anisotropy have very similar amplitude and shape (as a function of $\theta$).
This similarity is unlikely to change in the course of the PSF correction
procedure. Thus, they argue that if the B-mode is due to systematics in the
data analysis, a systematic error of very similar amplitude will also affect
the E-mode. Jarvis et al. (2003) found a significant B-mode signal on angular
scales below $\sim 30'$; hence, despite their detection of an E-mode signal
over a large range of angular scales $1' \lesssim \theta \lesssim 100'$, one
suspects that part of this signal might be due to non-lensing effects.

Given our lack of understanding about the origin of the B-mode signal,
and the associated likelihood that any effect causing a B-mode signal also
contributes a non-lensing part to the E-mode signal, one needs a prescription
on how to use the detected E-mode signal for a cosmological analysis.
Depending on what one believes the B-modes are due to, this prescription
varies. For example, if the B-mode is due to a residual systematic, one would
add its signal in quadrature to the error bars of the E-mode signal, as done
in van Waerbeke et al. (2002). On the other hand, if the B-mode signal is
due to intrinsic alignments of galaxies, as is at least suggested for the RCS
survey from Fig. 45 owing to its dependence on galaxy magnitudes, then it
could be more reasonable to subtract the B-mode signal from the E-mode
signal, if one assumes that intrinsic alignments produce similar amplitudes of
both modes [which is far from clear, however; Mackey et al. (2002) find that
the E-mode signal from intrinsic alignments is expected to be $\sim 3.5$ times
higher than the corresponding B-mode signal].
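
The two prescriptions can be written down in a few lines. This is a minimal sketch under simplifying assumptions: the function names are hypothetical, the measurement error of the B-mode itself is ignored, and the parameter `e_to_b_ratio` (the assumed ratio of the alignment-induced E-mode to the measured B-mode) is an illustrative knob, with 1.0 corresponding to equal amplitudes of both modes.

```python
import math


def treat_as_systematic(e_signal, e_error, b_signal):
    """B-mode as residual systematic: enlarge the E-mode error bar.

    The B-mode amplitude is added in quadrature to the E-mode error;
    the E-mode signal itself is left unchanged.
    """
    return e_signal, math.sqrt(e_error ** 2 + b_signal ** 2)


def treat_as_alignment(e_signal, e_error, b_signal, e_to_b_ratio=1.0):
    """B-mode as intrinsic alignment: subtract its assumed E-mode counterpart.

    e_to_b_ratio = 1.0 assumes equal E- and B-mode amplitudes from
    alignments; a larger value removes correspondingly more E-mode signal.
    """
    return e_signal - e_to_b_ratio * b_signal, e_error


# Hypothetical numbers in arbitrary units, for illustration only.
print(treat_as_systematic(5.0, 3.0, 4.0))
print(treat_as_alignment(5.0, 3.0, 4.0))
```

The choice between the two functions encodes exactly the assumption about the origin of the B-modes discussed in the text, which is why the resulting cosmological constraints can differ noticeably.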

Owing to the small size of the fields observed with the early HST instru- 
ments, no E/B-mode decomposition can be carried out from these surveys - 
the largest size of these fields is smaller than the angular scale at which the 
aperture mass dispersion is expected to peak (see Fig. 35). However, future 
cosmic shear studies carried out with ACS images will most likely be able to 
detect, or set upper bounds on the presence of B-modes. 

In fact, it is most likely that (most of) the B-mode signal seen in the
cosmic shear surveys is due to remaining systematics. Hoekstra (2004)
investigated the PSF anisotropy of the CFH12K camera using fields with a
high number density of stars. Randomly selecting about 100 stars per CCD,
which is the typical number observed in high galactic latitude fields, he
fitted a second-order polynomial to these stars representing the PSF
anisotropy. When all the stars in the field are corrected with this model, the
remaining stellar ellipticities carry substantial E- and B-mode signals,
essentially on all angular scales, but peaking at about the size of a CCD. A
substantially smaller residual is obtained if the ellipticities of stars in one
of the fields are corrected by a more detailed model of the PSF anisotropy as
measured from a different field; this improvement indicates that the PSF
anisotropy pattern in the data set used by Hoekstra is fairly stable between
different exposures. This, however, is not necessarily the case in other
datasets. Nevertheless, if one assumes that the PSF anisotropy is a
superposition of two effects, one from the properties of the telescope and
instrument itself, the other from the specific observation procedure (e.g.,
tracking, wind shake, etc.), and further assumes that the latter one affects
mainly the large-scale properties of the anisotropy pattern, then a
superposition of a PSF model (obtained from a dense stellar field and
describing the small-scale properties of the anisotropy pattern) plus a
low-order polynomial can be a better representation of the PSF anisotropy. This
indeed was verified in the tests made by Hoekstra (2004). In their reanalysis
of the VIRMOS-DESCART survey, van Waerbeke et al. (2004) have fitted
the PSF anisotropy with a rational function, instead of a polynomial. This
functional form was suggested by the study of Hoekstra (2004). When correcting
the galaxy ellipticities with this new PSF model, essentially no more
B-modes in the VIRMOS-DESCART survey are detected. Further studies on
PSF anisotropy corrections need to be conducted; possibly the optimal way
of dealing with them will be instrument-specific.

7.5 Cosmological constraints 

The measured cosmic shear signal can be translated into constraints on cos- 
mological parameters, by comparing the measurements with theoretical pre- 
dictions. In Sect. 6.4 we have outlined how such a comparison can be made; 
there, we have concentrated on the shear correlation functions as the primary 
observables. However, the detection of significant B-modes in the shear field 
makes the aperture measures the 'better' statistics to compare with predic- 
tions. They can be calculated from the shear correlation functions, as shown 
in (125). Calculating a likelihood function from the aperture mass dispersion 
proceeds in the same way as outlined in Sect. 6.4 for the correlation functions. 

We have argued in Sect. 6.3 that $\langle M_{\rm ap}^2(\theta)\rangle$ provides
very localized information about the power spectrum $P_\kappa(\ell)$ and is
thus a very useful statistic. One therefore might expect that the aperture mass
dispersion as calculated from the shear correlation functions contains
essentially all the second-order statistical information of the survey. This is
not true, however; one needs to recall that the shear correlation function
$\xi_+$ is a low-pass filter of the power spectrum, and thus contains
information on $P_\kappa$ on angular scales larger than the survey size. This
information is no longer contained in the aperture mass dispersion, owing to
its localized associated filter. Therefore, in order to keep this long-range
information in the comparison with theoretical predictions, it is useful to
complement the estimates of $\langle M_{\rm ap}^2(\theta)\rangle$ with either
the shear dispersion, or the correlation function $\xi_+$, at a scale which is
not much smaller than the largest scale at which
$\langle M_{\rm ap}^2(\theta)\rangle$ is measured. Note, however, that this
step implicitly assumes that on these large angular scales, the shear signal is
essentially free of B-mode contributions. If this assumption is not true, and
cannot be justified from the survey data, then this additional constraint
should probably be dropped.

The various constraints on parameters that have been derived from the
cosmic shear surveys differ in the amount of prior information that has been
used. As an example, we consider the analysis of van Waerbeke et al. (2002).
These authors have considered a model with four free parameters:
$\Omega_{\rm m}$, the normalization $\sigma_8$, the shape parameter
$\Gamma_{\rm spect}$ and the characteristic redshift $z_s$ (or, equivalently,
the mean redshift $\bar z_s$) of their galaxy sample, assuming a flat Universe,
i.e., $\Omega_\Lambda = 1 - \Omega_{\rm m}$. They have used a flat prior for
$\Gamma_{\rm spect}$ and $z_s$ in a fairly wide interval over which they
marginalized the likelihood function (see Fig. 46). Depending on the width of
these intervals, the confidence regions are more or less wide. It should be
noted that the confidence contours close if $\Gamma_{\rm spect}$ and $z_s$ are
assumed to be known (see van Waerbeke et al. 2001), but when these two
parameters are kept free, $\Omega_{\rm m}$ and $\sigma_8$ are degenerate.

The right panel of Fig. 46 shows the corresponding constraints as ob- 
tained from the RCS survey. Since this survey is shallower and only extends 

Fig. 46. Constraints on $\Omega_{\rm m}$ and $\sigma_8$ from two cosmic shear
surveys. Left: The VIRMOS-DESCART survey (van Waerbeke et al. 2002). The
grey-scale and dashed contours show the 68%, 95% and 99.9% confidence regions
with a marginalization over the range $\Gamma_{\rm spect} \in [0.05, 0.7]$, and
mean galaxy redshift in the range $\bar z_s \in [0.50, 1.34]$, whereas the
solid contours show the same confidence regions with the stronger priors
$\Gamma_{\rm spect} \in [0.1, 0.4]$ and $\bar z_s \in [0.8, 1.1]$. Right: The
RCS survey (Hoekstra et al. 2002a), showing the 1, 2, and $3\sigma$ confidence
regions for a prior $\Gamma_{\rm spect} \in [0.05, 0.5]$ and mean redshift
$\bar z_s \in [0.54, 0.66]$. In both cases, a flat Universe has been assumed

to magnitudes where spectroscopic surveys provide information on their
redshift distribution, the range of $z_s$ over which the likelihood is
marginalized is smaller than for the VIRMOS-DESCART survey. Correspondingly,
the confidence region is slightly smaller in this case. Even smaller confidence
regions are obtained if external information is used: Hoekstra et al. (2002a)
considered Gaussian priors with $\Omega_{\rm m} + \Omega_\Lambda = 1.02 \pm
0.06$, as follows from pre-WMAP CMB results, $\Gamma_{\rm spect} = 0.21 \pm
0.03$, as follows from the 2dF galaxy redshift survey, and $z_s = 0.59 \pm
0.02$, for which the width of the valley of maximum likelihood narrows
considerably. Jarvis et al. (2003) used for their estimate of cosmological
parameters the aperture mass dispersion at three angular scales plus the shear
dispersion at $\theta = 100'$, and they considered alternatively the E-mode
signal, and the E-mode signal $\pm$ the B-mode signal, to arrive at constraints
in the $\Omega_{\rm m}$-$\sigma_8$ parameter plane. Since the CTIO survey
samples a larger angular scale than the other surveys (data at small angular
scales are discarded owing to the large B-mode signal there), the results are
much less sensitive to $\Gamma_{\rm spect}$; furthermore, for the same reason
the Jarvis et al. results are much less sensitive to the fit of the non-linear
power spectrum according to Peacock & Dodds (1996), which van Waerbeke et al.
(2002) found to be not accurate enough for some cosmological models. In fact,
if instead of the Peacock & Dodds fitting formula, the fit by Smith et al.
(2003) is used to describe the non-linear power spectrum, the resulting best
estimate of $\sigma_8$ is decreased by 8% for the RCS survey (as quoted in
Jarvis et al. 2003).

For the RCS and the CTIO surveys, the covariance matrix was obtained
from field-to-field variations, i.e., ${\rm Cov}_{ij} = \langle (d_i - \mu_i)
(d_j - \mu_j) \rangle$, where $\mu_i$ is the mean of the observable $d_i$
(e.g., the aperture mass dispersion at a specific angular scale) over the
independent patches of the survey, and angular brackets denote the average over
all independent patches. The estimate of the covariance matrix for the
VIRMOS-DESCART survey is slightly different, as it has only four independent
patches.
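
The field-to-field estimate can be sketched as follows. This is a minimal illustration of the formula as written above (a plain average over patches); whether one divides by the number of patches or by that number minus one, and how the result is rescaled to the covariance of the survey mean, are choices not specified here. The function name and the sample numbers are hypothetical.

```python
def field_to_field_covariance(patches):
    """Mean and covariance of observables from independent survey patches.

    patches[p][i] is the observable d_i (e.g. the aperture mass dispersion
    at the i-th angular scale) measured on patch p.
    Returns (mu, cov) with cov[i][j] = <(d_i - mu_i)(d_j - mu_j)>,
    the average being taken over patches.
    """
    n_patches = len(patches)
    n_obs = len(patches[0])
    mu = [sum(p[i] for p in patches) / n_patches for i in range(n_obs)]
    cov = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in patches) / n_patches
            for j in range(n_obs)]
           for i in range(n_obs)]
    return mu, cov


# Three made-up patches with two observables each.
mu, cov = field_to_field_covariance([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
print(mu)
print(cov)
```

With only a handful of patches the estimated matrix is very noisy, which illustrates why a survey with four patches needs a different approach.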

To summarize the results from these surveys, each of them found that
a combination of parameters of the form $\sigma_8 \Omega_{\rm m}^{\alpha}$ is
determined best from the data, with $\alpha \sim 0.55$, where the exact value
of $\alpha$ depends on the survey depth. If we consider the specific case of
$\Omega_{\rm m} = 0.3$, which is close to the concordance value that was
recently confirmed by WMAP, then the VIRMOS-DESCART survey yields $\sigma_8 =
0.94 \pm 0.12$, the RCS survey has $\sigma_8 = 0.91$ with asymmetric error
bars, which improves to $\sigma_8 = 0.86^{+0.04}_{-0.05}$ if the stronger
(Gaussian) priors mentioned above are used, and the CTIO survey yields
$\sigma_8 = 0.71^{+0.12}_{-0.16}$, here as $2\sigma$ limits. Whereas these
results are marginally in mutual agreement, the CTIO value for $\sigma_8$ is
lower than the other two. The higher values are also supported by results from
the WFPC2 survey by Refregier et al. (2002), who find $\sigma_8 = 0.94 \pm
0.17$, Bacon et al. (2003) with $\sigma_8 = 0.97 \pm 0.13$, and the earlier
surveys discussed in Sect. 7.1. The only survey supporting the low value of the
CTIO survey is COMBO-17 (Brown et al. 2003; see also the reanalysis of this
dataset by Heymans et al. 2004). Most likely, these remaining discrepancies
will be clarified in the near future; see discussion below. It should also be
noted that at least for some of the surveys, a large part of the uncertainty
comes from the unknown redshift distribution of the galaxies; this situation
will most likely improve, as efficient spectrographs with large multiplex
capability become available at 10m-class telescopes, which will in the near
future deliver large galaxy redshift surveys at very faint magnitudes. Those
can be used to constrain the redshift distribution of the source galaxies in
cosmic shear surveys much better.
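
Since the surveys constrain a combination of the normalization and the density parameter rather than each separately, a value quoted at $\Omega_{\rm m} = 0.3$ can be translated to another density parameter by holding that combination fixed. The sketch below does this for a power-law degeneracy with exponent 0.55; the function name is hypothetical, and the default reference values are simply the VIRMOS-DESCART numbers quoted above, used for illustration.

```python
def sigma8_at(omega_m, sigma8_ref=0.94, omega_m_ref=0.3, alpha=0.55):
    """sigma_8 implied at omega_m when sigma_8 * omega_m**alpha is fixed.

    sigma8_ref is the value quoted at omega_m_ref; alpha is the exponent
    of the degeneracy direction (survey dependent, about 0.55).
    """
    return sigma8_ref * (omega_m_ref / omega_m) ** alpha


for om in (0.25, 0.30, 0.35):
    print(f"Omega_m = {om:.2f}  ->  sigma_8 = {sigma8_at(om):.3f}")
```

A lower assumed density parameter thus requires a higher normalization to produce the same shear signal, which is the degeneracy visible as the elongated valley in Fig. 46.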

7.6 3-D lensing 

As mentioned several times before, using individual source redshift informa- 
tion, as will become available in future multi-color wide-field surveys, can 
improve the cosmological constraints obtained from weak lensing. In this sec- 
tion we shall therefore summarize some of the work that has been published 
on this so-called 3-D lensing. 

Three-dimensional matter distribution. Provided the redshifts of indi- 
vidual source galaxies are known (or estimated from their multiple colors), 
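one can derive the 3-D matter distribution, not only its projection. The
principle of this method can be most easily illustrated in the case of a flat
Universe, for which the surface mass density $\kappa(\boldsymbol{\theta}, w)$
for sources at comoving distance $w$ becomes, see (93),

\[
  \kappa(\boldsymbol{\theta}, w) = \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2}
    \int_0^w {\rm d}w'\; \frac{w'\,(w - w')}{w}\;
    \frac{\delta(w'\boldsymbol{\theta}, w')}{a(w')} . \qquad (129)
\]

Multiplying this expression by $w$ and differentiating twice yields

\[
  \frac{\partial^2}{\partial w^2} \bigl( w\, \kappa(\boldsymbol{\theta}, w) \bigr)
    = \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2}\, \frac{w}{a(w)}\,
      \delta(w\boldsymbol{\theta}, w) ,
\]

which therefore allows one to obtain the three-dimensional density contrast
$\delta$ in terms of the surface mass densities $\kappa$ at different source
redshifts. As we have seen in Sect. 5, there are several methods for obtaining
the surface mass density from the observed shear. To illustrate the 3-D method,
we use the finite-field reconstruction in the form of (60), for which one finds

\[
  \delta(w\boldsymbol{\theta}, w) = \frac{2 c^2}{3 H_0^2 \Omega_{\rm m}}\,
    \frac{a(w)}{w}\, \frac{\partial^2}{\partial w^2}
    \int {\rm d}^2\theta'\; \boldsymbol{H}(\boldsymbol{\theta}'; \boldsymbol{\theta})
    \cdot \bigl[ w\, \boldsymbol{u}_\gamma(\boldsymbol{\theta}', w) \bigr] .
    \qquad (130)
\]
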
Taylor (2001) derived the foregoing result, but concentrated on the 3-D gravi- 
tational potential instead of the mass distribution, and Bacon & Taylor (2003) 
and Hu & Keeton (2003) discussed practical implementations of this relation. 
First to note is the notorious mass-sheet degeneracy, which in the present con- 
text implies that one can add an arbitrary function of w to the reconstructed 
density contrast $\delta$. This cannot be avoided, but if the data field is sufficiently 
large, so that averaged over it, the density contrast is expected to vanish, this 
becomes a lesser practical problem. For such large data fields, the above mass 
reconstruction can be substituted in favour of the simpler original Kaiser & 
Squires (1993) method. Still more freedom is present in the reconstruction of 
the gravitational potential. The second problem is one of smoothing: owing 
to the noisiness of the observed shear field, the $w$-differentiation (as well as 
the $\theta$-differentiation present in the construction of the vector field $u_\gamma$) needs 
to be carried out on the smoothed shear field. A discretization of the observed 
shear field, as also suggested by the finite accuracy of photometric redshifts, 
can be optimized with respect to this smoothing (Hu & Keeton 2003). 
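
The inversion described above can be checked numerically along a single line of sight. The following is a minimal sketch, assuming the flat-Universe relation in which $w\,\kappa(w)$ is the integral of $w'(w-w')\,\delta/a$ up to a constant prefactor (set to 1 here); the function names, the toy overdensity and the noise level are illustrative choices, not taken from the cited papers. It also illustrates why smoothing is needed: a tiny perturbation of $\kappa$ is strongly amplified by the second derivative.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y(x)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def kappa_along_sightline(w, f, prefac=1.0):
    """kappa(w_i) = prefac * int_0^{w_i} dw' w'(w_i - w')/w_i * f(w'),
    where f = delta/a is sampled on the uniform grid w (with w[0] = 0)."""
    kappa = np.zeros_like(w)
    for i in range(1, len(w)):
        wp = w[: i + 1]
        kappa[i] = prefac * trapezoid(wp * (w[i] - wp) / w[i] * f[: i + 1], wp)
    return kappa

def delta_over_a_from_kappa(w, kappa, prefac=1.0):
    """Recover delta/a at the interior points w[1:-1] from the second
    derivative of w*kappa, using central finite differences."""
    h = w[1] - w[0]
    wk = w * kappa
    second = (wk[2:] - 2.0 * wk[1:-1] + wk[:-2]) / h**2
    return second / (prefac * w[1:-1])

# Toy example: a single overdense "sheet" at comoving distance 0.8 (arbitrary units).
w = np.linspace(0.0, 2.0, 201)
f_true = np.exp(-0.5 * ((w - 0.8) / 0.1) ** 2)
kappa = kappa_along_sightline(w, f_true)
f_rec = delta_over_a_from_kappa(w, kappa)

# The same inversion applied to kappa with small (hypothetical) noise added.
rng = np.random.default_rng(1)
f_noisy = delta_over_a_from_kappa(w, kappa + 1e-4 * rng.standard_normal(w.size))
```

In this toy setup the noise-free recovery reproduces the input, whereas the noisy one does not, which is the practical reason for differentiating a smoothed field.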

A first application of this method was presented in Taylor et al. (2004) 
on one of the COMBO-17 fields which contains the supercluster A901/902. 
The clusters present clearly show up also in the 3-D mass map, as well as a 
massive structure behind the cluster A902 at higher redshift. Already earlier, 
Wittman et al. (2001, 2003) estimated the redshifts of clusters found in their 
deep blank-field data by studying the dependence of the weak lensing signal 
on the estimated source redshifts, and subsequent spectroscopy showed that 
these estimates were fairly accurate. 

Power spectrum estimates. A redshift-dependent shear field can also be 
used to improve on the cosmological constraints obtained from cosmic shear. 
Hu (1999) has pointed out that even crude information on the source redshifts 
can strongly reduce the uncertainties of cosmological parameters. In fact, the 
3-D power spectrum can be constructed from redshift-dependent shear data 
(see, e.g., Heavens 2003, Hu 2002, and references therein). For illustration 
purposes, one can use the $\kappa$ power spectrum for sources at fixed comoving 
distance $w$, which reads in a flat Universe - see (99) 

$$P_\kappa(\ell, w) = \frac{9 H_0^4 \Omega_{\rm m}^2}{4c^4} \int_0^w {\rm d}w'\, \frac{(w-w')^2}{w^2\, a^2(w')}\, P_\delta\!\left( \frac{\ell}{w'}, w' \right) . \qquad (131)$$

Differentiating $w^2 P_\kappa$ three times w.r.t. $w$ then yields (Bacon et al. 2004) 

$$P_\delta(k, w) = \frac{2c^4}{9 H_0^4 \Omega_{\rm m}^2}\, a^2(w)\, \frac{\partial^3}{\partial w^3} \left[ w^2 P_\kappa(wk, w) \right] . \qquad (132)$$

In this way, one could obtain the three-dimensional power spectrum of the 
matter. However, this method is essentially useless, since it is both very noisy 
(due to the third-order derivatives) and throws away most of the informa- 
tion contained in the shear field, as it makes use only of shear correlations 
of galaxies having the same redshift, and not of all the pairs at different dis- 
tances. A much better approach to construct the three-dimensional power 
spectrum is given, e.g., by Pen et al. (2003). 

In my view, the best use of three-dimensional data is to construct the shear 
correlators $\xi_\pm(\theta; z_1, z_2)$, as they contain all second-order statistical information
in the data and at the same time allow the identification and removal 
of a signal from intrinsic shape correlations of galaxies (King & Schneider 
2003). From these correlation functions, one can calculate a $\chi^2$ function as in 
(116) and minimize it w.r.t. the wanted parameters. One problem of this ap- 
proach is the large size of the covariance matrix, which now has six arguments 
(two angular separations and four redshifts). However, as shown in Simon et 
al. (2004), it can be calculated fairly efficiently, provided one assumes that 
the fourth-order correlations factorize into products of two-point correlators, 
i.e., Gaussian fields (if this assumption is dropped, the covariance must be 
calculated from cosmological N-body simulations). 
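
A minimal sketch of such a $\chi^2$ evaluation, assuming the measured correlators are stacked into one data vector with a known covariance matrix. The power-law template, the 10% errors and the one-parameter amplitude grid are purely hypothetical stand-ins for a real model prediction and a real covariance.

```python
import numpy as np

def chi2(data, model, cov):
    """chi^2 = (d - m)^T C^{-1} (d - m); solve() avoids forming the explicit inverse."""
    r = np.asarray(data) - np.asarray(model)
    return float(r @ np.linalg.solve(cov, r))

# Hypothetical data vector: a correlator sampled at five angular separations.
theta = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
template = 1e-4 * theta ** -0.8           # toy model shape, not a real prediction
data = 2.0 * template                     # noise-free "measurement" with amplitude 2
cov = np.diag((0.1 * template) ** 2)      # toy diagonal covariance (10% errors)

# One-parameter grid search over the model amplitude.
amplitudes = np.linspace(0.5, 3.5, 301)
chi2_values = np.array([chi2(data, A * template, cov) for A in amplitudes])
best_amplitude = float(amplitudes[np.argmin(chi2_values)])
```

For the redshift-dependent correlators the data vector runs over all angular bins and redshift-bin pairs, so the covariance is no longer diagonal; the function above is unchanged in that case.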

Bacon et al. (2004) used the COMBO-17 data to derive the shape of the 
power spectrum, using the redshift-dependent shear correlations. They parameterize
the power spectrum in the form $P(k, z) \propto A\, k^{\alpha}\, {\rm e}^{-sz}$, so that it is 
described by an amplitude $A$, a local slope $\alpha$ and a growth parameter $s$ which 
describes how the amplitude of the power spectrum declines towards higher 
redshifts. In fact, the slope $\alpha = -1.2$ was fixed to the approximate value in 
$\Lambda$CDM models over the relevant range of spatial scales and redshifts probed 
by the COMBO-17 data (since the data used cover only 1/2 deg$^2$, reducing 
the number of free parameters by fixing $\alpha$ is useful). The evolution of the 
power spectrum is found with high significance in the data. Furthermore, the 
authors show that the use of redshift information improves the accuracy in 
the determination of $\sigma_8$ by a factor of two compared to the 2-D cosmic shear 
analysis of the same data (Brown et al. 2003). 

The main application of future multi-waveband cosmic shear surveys will 
be to derive constraints on the equation of state of dark energy, as besides 
lensing there are only a few methods available to probe it, most notably the 
magnitude-redshift relation of SN Ia. Since dark energy starts to dominate the 
expansion of the Universe only at relatively low redshifts, little information 
about its properties is obtainable from the CMB anisotropies alone. For that 
reason, quite a number of workers have considered the constraints on the 
dark energy equation of state that can be derived from future cosmic shear 
surveys (e.g., Huterer 2002; Hu 2002; Munshi & Wang 2003; Hu & Jain 
2003; Abazajian & Dodelson 2003; Benabed & van Waerbeke 2003; Song & 
Knox 2003). The results of these are very encouraging; the sensitivity to 
the dark energy properties is due to its influence on structure growth. With 
(photometric) redshift information on the source galaxies, the evolution of 
the dark matter distribution can be studied by weak lensing, as shown above. 
Van Waerbeke & Mellier (2003) have compared the expected accuracy of the 
cosmic shear result from the ongoing CFHT Legacy Survey with the variation 
of various dark energy models and shown that the CFHTLS will be able to 
discriminate between some of these models, with even much better prospects 
from future space-based wide-field imaging surveys (e.g., Hu & Jain 2003). 

7.7 Discussion 

The previous sections have shown that cosmic shear research has matured; 
several groups have successfully presented their results. This is important 
because the effects one wants to observe are small and subject to various 
systematic influences, so that independent results from different instruments, 
groups, and data analysis techniques are essential in this research. We have 
also seen that the results from the various groups tend to agree with each 
other, with a few very interesting discrepancies remaining whose resolution 
will most likely teach us even more about the accuracies of data analysis 
procedures. 

Lessons for cosmology. A natural question to ask is, what has cosmic 
shear taught us so far about cosmology? The most important constraint 
coming from the available cosmic shear results is that on the normalization 
$\sigma_8$, for which only few other accurate methods are available. We have seen 
that cosmic shear prefers a value of $\sigma_8 \sim 0.8 - 0.9$, which is slightly larger 
than current estimates from the abundance of clusters, but very much in 
agreement with the measurement of WMAP. The estimate from the cluster 
abundance is, however, not without difficulties, since it involves several scal- 
ing relations which need to be accurately calibrated; hence, different authors 
arrive at different values for $\sigma_8$ (see, e.g., Pierpaoli, Scott & White 2001; Seljak
2002; Schuecker et al. 2003). The accuracy with which $\sigma_8$ is determined 
from CMB data alone is comparable to that of cosmic shear estimates; as 
shown in Spergel et al. (2003), more accurate values of $\sigma_8$ are obtained only 
if the CMB measurements are combined with measurements on smaller spatial
scales, such as from galaxy redshift surveys and the Lyman alpha forest 
statistics. Thus, the $\sigma_8$-determination from cosmic shear is certainly competitive
with other measurements. Arguably, cosmic shear stands out in this set 
of smaller-scale constraints due to the fewer physical assumptions needed for 
its interpretation. 

But more importantly, it provides a fully independent method to measure
cosmological parameters. Hence, at present the largest role of the cosmic 
shear results is that they provide an independent approach to determining these 
parameters; agreement with those obtained from the CMB, galaxy redshift 
surveys and other methods is thus foremost of interest in that it provides
additional evidence for the self-consistency of our cosmological model 
which, taken at face value, is a pretty implausible one: we should always 
keep in mind that we are claiming that our Universe consists of 4.5% normal 
(baryonic) matter, with the rest being shared by stuff that we have given 
names to ('dark matter', 'dark energy'), but whose nature we are pretty 
ignorant about. In this sense, cosmic shear plays an essential role in shaping our 
cosmological view, and has become one of the pillars on which our standard 
model rests. 

Agreement, or discrepancies? How to clarify the remaining discrepancies 
that were mentioned before - what are they due to? One needs to step back 
for a second and be amazed that these results are in fact so well in agreement 
as they are, given all the technical problems a cosmic shear survey has to 
face (see Sect. 3). Nevertheless, more investigations concerning the accuracy 
of the results need to be carried out, e.g., to study the influence of the different 
schemes for PSF corrections on the final results. For this reason, it would be 
very valuable if the same data set were analyzed by two independent groups 
and the results compared in detail. Such comparative studies may be a 
prerequisite for the future when much larger surveys will turn cosmic shear 
into a tool for precision cosmology. 

Joint constraints from CMB anisotropies and cosmic shear. As mentioned
before, the full power of the CMB anisotropy measurements is achieved 
when these results are combined with constraints on smaller spatial scales. 
The tightest constraints from WMAP are obtained when it is combined with 
results from galaxy redshift surveys and the statistics of the Ly$\alpha$ forest absorption
lines (Spergel et al. 2003). Instead of the latter, one can 
use results from cosmic shear, as it provides a cleaner probe of the statistical
properties of the matter distribution in the Universe. As was pointed 
out before (e.g., Hu & Tegmark 1999; see Fig. 34), the combination of CMB 
measurements with cosmic shear results is particularly powerful to break de- 
generacies that are left from using the former alone. Contaldi et al. (2003) 
used the CMB anisotropy results from WMAP (Bennett et al. 2003), supple- 
mented by anisotropy measurements on smaller angular scales from ground- 
based experiments, and combined them with the cosmic shear aperture mass 
dispersion from the RCS survey (Hoekstra et al. 2002a). As is shown in Fig. 
47, the constraints in the $\Omega_{\rm m}$-$\sigma_8$ parameter plane are nearly mutually orthogonal
for the CMB and cosmic shear, so that the combined confidence region 
is substantially smaller than each of the individual regions. 

Fig. 47. The confidence region in 
the $\Omega_{\rm m}$-$\sigma_8$ plane obtained from 
the two-dimensional marginalized 
likelihood. Shown are the 68% 
and 95% confidence regions de- 
rived individually from the CMB 
and the RCS cosmic shear survey, 
as well as those obtained by com- 
bining both constraints (Contaldi 
et al. 2003) 



Wide vs. deep surveys. In designing future cosmic shear surveys, the 
survey strategy needs to decide the effective exposure time. For a given total 
observing time (the most important practical constraint), one needs to find 
a compromise between depth and area. Several issues need to be considered 
in this respect: 

• The lensing signal increases with redshift, and therefore with increasing 
depth of a survey; it should therefore be easier to detect a lensing signal 
in deep surveys. Furthermore, by splitting the galaxy sample into sub- 
samples according to the magnitude (and/or colors), one can study the 
dependence of the lensing signal on the mean source redshift, which is 
an important probe of the evolution of the matter power spectrum, and 
thus of cosmology. If one wants to probe the (dark) matter distribution 
at appreciable redshifts (z ~ 0.5), one needs to carry out deep surveys. 

• A wider survey is more likely to probe the linear part of the power spec- 
trum which is more securely predicted from cosmological models than the 
non-linear part; on the other hand, measurement of the latter, when com- 
pared with precise models (e.g., from numerical simulations), can probe 
the non-linear gravitational clustering regime. 

• Depending on the intrinsic galaxy alignment, one would prefer deeper 
surveys, since the relative importance of the intrinsic signal decreases with 
increasing survey depth. Very shallow surveys may in fact be strongly 
affected by the intrinsic signal (e.g., Heymans & Heavens 2003). On the 
other hand, for precision measurements, as will become available in the 
near future, one needs to account for the intrinsic signal in any case, 
using redshift information (at least in a statistical sense), and so shallow 
surveys lose this potential disadvantage. In fact, the redshift estimates of 
shallower surveys are easier to obtain than for deeper ones. 

• In this context, one needs to compromise between area and the number
of filters in which exposures should be taken. Smaller area means 
worse statistics, e.g., larger effects of cosmic variance, but this has to 
be balanced against the additional redshift information. Also, if a fixed 
observing time is used, one needs to account for the weather, seeing and 
sky brightness distribution. One should then devise a strategy such that the 
best seeing periods are used to obtain images in the filter which is used 
for shape measurements, and bright time shall be spent on the longest 
wavelength bands. 

• Fainter galaxies are smaller, and thus more strongly affected by the point- 
spread function. One therefore expects that PSF corrections are on aver- 
age smaller for a shallow survey than for a deeper one. In addition, the 
separation between stars and galaxies is easier for brighter (hence, larger) 
objects. 

The relative weight of these arguments is still to be decided. Whereas some 
of the issues could be clarified with theoretical investigations (i.e., in order 
to obtain the tightest constraints on cosmological parameters, what is the 
optimal choice of area and exposure time, with their product being fixed), 
others (like the importance of intrinsic alignments) still remain unclear. Since 
big imaging surveys will be conducted with a broad range of scientific ap- 
plications in mind, this choice will also depend on those additional science 
goals. 

Future surveys. We are currently witnessing the installation of square-
degree cameras at some of the best sites, among them Megacam at the CFHT, 
and OmegaCAM at the newly built VLT Survey Telescope (the 2.6m VST) 
on Paranal (I present here European-biased prospects, as I am most familiar 
with these projects). Weak lensing, and in particular cosmic shear has been 
one of the science drivers for these instruments, and large surveys will be 
carried out with them. Already ongoing is the CFHT Legacy Survey, which 
will consist of three parts; the most interesting one in the current context is 
a $\sim 160$ deg$^2$ survey with an exposure time of $\sim 1$ h in each of five optical 
filters. This survey will therefore yield a more than ten-fold increase over 
the current VIRMOS-DESCART survey, with corresponding reductions of 
the statistical and cosmic variance errors on measurements. The multi-color 
nature of this survey implies that one can obtain photometric redshift estimates
at least for a part of the galaxies which will enable the suppression 
of the potential contribution to the shear signal from intrinsic alignments of 
galaxies. A forecast of the expected accuracy of cosmological parameter es- 
timates from the CFHTLS combined with the WMAP CMB measurements 
has been obtained by Tereno et al. (2004). It is expected that a substantial 
fraction of the VST observing time will be spent on multi-band wide-field 
surveys which, if properly designed, will be extremely useful for cosmic shear 
research. In order to complement results from the CFHTLS, accounting for 
the fact that the VST has smaller aperture than the CFHT (2.6m vs. 3.6m), 
a somewhat shallower but wider-field survey would be most reasonable. For 
both of these surveys, complementary near-IR data will become available after
about 2007, with the WIRCam instrument on CFHT, and the newly built 
VISTA 4m-telescope equipped with a wide-field near-IR camera on Paranal, 
which will yield much better photometric redshift estimates than the optical
data alone. Furthermore, with the Pan-STARRS project, a novel method for 
wide-field imaging and a great leap forward in the data access rate will be 
achieved. 

Towards the end of the decade, a new generation of cosmic shear surveys 
may be started; there are two projects currently under debate which would 
provide a giant leap forward in terms of survey area and/or depth. One is a 
satellite project, SNAP/JDEM, originally designed for finding and follow-up 
of high-redshift supernovae to study the expansion history of the Universe 
and in particular to learn about the equation of state of the dark energy. With 
its large CCD array and multi-band imaging, SNAP will also be a wonderful 
instrument for cosmic shear research, yielding photometric redshift estimates 
for the faint background galaxies, and it is expected that the observing time 
of this satellite mission will be split between these two probes of dark energy. 
The other project under discussion is the LSST, an 8m telescope equipped 
with a $\sim 9$ deg$^2$ camera; such an instrument, with an efficiency larger than 
a factor 40 over Megacam@CFHT, would allow huge cosmic shear surveys, 
easily obtaining a multi-band survey over all extragalactic sky (modulo the 
constraints from the hemisphere). Since studying the equation of state of 
dark energy will be done most effectively with good photometric redshifts 
of source galaxies, the space experiment may appear more promising, given 
the fact that near-IR photometry is needed for a reliable redshift estimate, 
and sufficiently deep near-IR observations over a significant area of sky are not 
possible from the ground. 

8 The mass of, and associated with, galaxies 

8.1 Introduction 

Whereas galaxies are not massive enough to show a weak lensing signal individually
(see eq. 19), the signal of many galaxies can be superposed statistically.
Therefore, if one considers sets of foreground (lens) and background 
galaxies, then in the mean, in a foreground-background galaxy pair, the image 
ellipticity of the background galaxy will be preferentially oriented in the di- 
rection tangent to the line connecting foreground and background galaxy. The 
amplitude of this tangential alignment then yields a mean lensing strength 
that depends on the redshift distributions of foreground and background 
galaxies, and on the mass distribution of the former population. This ef- 
fect is called galaxy-galaxy lensing and will be described in Sect. 8.2 below; 
it measures the mass properties of galaxies, provided the lensing signal is 
dominated by the galaxies themselves. This will not be the case for larger 
angular separations between foreground and background galaxies, since then 
the mass distribution in which the foreground galaxies are embedded (e.g., 
their host groups or clusters) starts to contribute significantly to the shear 
signal. The interpretation of this signal then becomes more difficult. On even 
larger scales, the foreground galaxies contribute negligibly to the lens signal; 
a spatial correlation between the lens strength and the foreground galaxy 
population then reveals the correlation between light (galaxies) and mass in 
the Universe. This correlated distribution of galaxies with respect to the un- 
derlying (dark) matter in the Universe - often called the bias of galaxies - 
can be studied with weak lensing, as we shall describe in Sect. 8.3 by using 
the shear signal, and in Sect. 8.4 employing the magnification effect. It should 
be pointed out here that our lack of knowledge about the relation between 
the spatial distribution of galaxies and that of the underlying (dark) matter 
is one of the major problems that hampers the quantitative interpretation of 
galaxy redshift surveys; hence, these lensing studies can provide highly valu- 
able input into the conclusions drawn from these redshift surveys regarding 
the statistical properties of the mass distribution in the Universe. 

8.2 Galaxy-galaxy lensing 

The average mass profile of galaxies. Probing the mass distribution 
of galaxies usually proceeds with dynamical studies of luminous tracers. The 
best-known method is the determination of the rotation curves of spiral galax- 
ies, measuring the rotational velocity of stars and gas as a function of distance 
from the galaxy's center (see Sofue & Rubin 2001 for a recent review). This 
then yields the mass profile of the galaxy, i.e. $M(<r) \propto v_{\rm rot}^2(r)\, r$. For elliptical
galaxies, the dynamics of stars (like velocity dispersions and higher-order 
moments of their velocity distribution, as a function of r) is analyzed to 
obtain their mass profiles; as the kinematics of stars in ellipticals is more 
complicated than in spirals, their mass profiles are more difficult to measure 
(e.g., Gerhard et al. 2001). In both cases, these dynamical methods provided 
unambiguous evidence for the presence of a dark matter halo in which the 
luminous galaxy is embedded; e.g., the rotation curves of spirals are flat out 
to the most distant point where they can be measured. The lack of stars or 
gas prevents the measurement of the mass profile to radii beyond the luminous
extent of galaxies, that is beyond $\sim 10\,h^{-1}$ kpc. Other luminous tracers 
that have been employed to study galaxy masses at larger radii include globular
clusters that are found at large galacto-centric radii (Côté et al. 2003), 
planetary nebulae, and satellite galaxies. Determining the relative radial velocity
distribution of the latter with respect to their suspected host galaxy 
leads to estimates of the dark matter halo out to distances of $\sim 100\,h^{-1}$ kpc. 
These studies (e.g., Zaritsky et al. 1997) have shown that the dark matter 
halo extends out to at least these distances. 
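
As a rough illustration of the scaling $M(<r) \propto v_{\rm rot}^2(r)\, r$, the sketch below evaluates the enclosed mass under the additional assumption of circular orbits in a spherically symmetric mass distribution, for which the proportionality constant is $1/G$. The rotation velocity and radii are illustrative values only.

```python
# Gravitational constant in units of kpc (km/s)^2 / M_sun.
G_KPC = 4.30091e-6

def enclosed_mass(v_rot_kms, r_kpc):
    """Enclosed mass M(<r) = v_rot^2 r / G in solar masses.

    Assumes circular orbits in a spherically symmetric mass distribution.
    """
    return v_rot_kms**2 * r_kpc / G_KPC

# A flat rotation curve (constant v_rot) implies M(<r) growing linearly with r.
mass_10kpc = enclosed_mass(200.0, 10.0)
mass_20kpc = enclosed_mass(200.0, 20.0)
```

With a flat rotation curve, doubling the radius doubles the enclosed mass, which is the behaviour that points to an extended dark halo.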

One of the open questions regarding the dark matter profile of galaxies is 
the spatial extent of the halos. The dynamical studies mentioned above are all 
compatible with the mass profile following approximately an isothermal law 
($\rho \propto r^{-2}$), which has to be truncated at a finite radius to yield a finite total 
mass. Over the limited range in radii, the isothermal profile cannot easily 
be distinguished from an NFW mass profile (see IN, Sect. 6.2), for which 
measurements at larger distances are needed (the mass distribution in the 
central parts of galaxies is affected by the baryons and thus not expected to 
follow the NFW profile; see Sect. 7 of SL). 

Weak gravitational lensing provides a possibility to study the mass pro- 
files of galaxies at still larger radii. Light bundles from distant background 
galaxies provide the 'dynamical tracers' that cannot be found physically as- 
sociated with the galaxies. Light bundles get distorted in such a way that on 
average, images of background sources are oriented tangent to the transverse 
direction connecting foreground (lens) and background (source) galaxy. The 
first attempt to detect such a galaxy-galaxy lensing signal was reported in 
Tyson et al. (1984), but the use of photographic plates and the relatively 
poor seeing prevented a detection. Brainerd et al. (1996) presented the first 
detection and analysis of galaxy-galaxy lensing. Since then, quite a number 
of surveys have measured this effect, some of them using millions of galaxies. 

Strategy. Consider pairs of fore- and background galaxies, with separation 
in a given angular separation bin. The expected lensing signal is seen as a 
statistical tangential alignment of background galaxy images with respect to 
foreground galaxies. For example, if $\phi$ is the angle between the major axis of 
the background galaxy and the connecting line, values $\pi/4 < \phi < \pi/2$ should 
be slightly more frequent than $0 < \phi < \pi/4$ (see Fig. 48). Using the fact that 
the intrinsic orientations of background galaxies are distributed isotropically, 
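one can show (Brainerd et al. 1996) that 

$$p(\phi) = \frac{2}{\pi} \left[ 1 - \gamma_{\rm t} \left\langle \frac{1}{|\epsilon^{\rm s}|} \right\rangle \cos(2\phi) \right] , \qquad (133)$$

where $\phi \in [0, \pi/2]$, $\epsilon^{\rm s}$ is the intrinsic source ellipticity, and $\gamma_{\rm t}$ is the 
mean tangential shear in the angular bin chosen. Thus, the amplitude of the 
cos-wave yields the (average) strength of the shear. 
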

Fig. 48. The probability distribution $p(\phi)$ of the angle $\phi$ between the 
major axis of the background galaxy image and the connecting line to the 
foreground galaxy is plotted for the sample of Brainerd et al. (1996), together 
with the best fit according to (133). The galaxy pairs have separation 
$5'' \le \Delta\theta \le 34''$, and are foreground-background selected by their 
apparent magnitudes. 
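
The amplitude of such a cos-wave can be estimated without binning. The sketch below assumes a distribution of the form $p(\phi) = (2/\pi)[1 - A\cos(2\phi)]$ on $[0, \pi/2]$, where $A$ stands for the product of the mean tangential shear and the mean inverse intrinsic ellipticity; for this form $\langle \cos 2\phi \rangle = -A/2$, so $\hat{A} = -2\,\overline{\cos 2\phi}$. The sampler, the sample size and the input amplitude are illustrative, not values from Brainerd et al. (1996).

```python
import numpy as np

def sample_angles(amplitude, n_proposals, rng):
    """Rejection-sample phi in [0, pi/2] from p(phi) ~ 1 - amplitude*cos(2 phi)."""
    phi = rng.uniform(0.0, np.pi / 2, n_proposals)
    envelope = 1.0 + abs(amplitude)            # upper bound of the unnormalized density
    u = rng.uniform(0.0, envelope, n_proposals)
    return phi[u < 1.0 - amplitude * np.cos(2.0 * phi)]

def estimate_amplitude(phi):
    """Moment estimator: <cos 2phi> = -A/2 for the assumed distribution."""
    return float(-2.0 * np.mean(np.cos(2.0 * phi)))

rng = np.random.default_rng(0)
phi = sample_angles(0.1, 400_000, rng)   # hypothetical, exaggerated amplitude
amp_hat = estimate_amplitude(phi)
```

The statistical error of this estimator scales as $\sqrt{2/N}$ for $N$ pairs, which shows why several thousand pairs are needed to detect amplitudes of a few per cent.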

The mean tangential ellipticity $\langle \epsilon_{\rm t}(\theta) \rangle$ of background galaxies relative 
to the direction towards foreground galaxies measures the mean tangential 
shear at separation $\theta$. Since the signal is averaged over many foreground-
background pairs, it measures the average mass profiles of the foreground 
galaxies. For sufficiently large samples of galaxies, the lens sample can be 
split into several subsamples, e.g., according to their color and/or morphology 
(early-type vs. late-type galaxies), or, if redshift estimates are available, they 
can be binned according to their luminosity. Then, the mass properties can 
be derived for each of the subsamples. 
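
A minimal sketch of how the mean tangential ellipticity in separation bins might be computed from a pair catalogue. It assumes the common convention $\epsilon_{\rm t} = -(\epsilon_1 \cos 2\varphi + \epsilon_2 \sin 2\varphi)$, with $\varphi$ the polar angle of the source position relative to the lens and $\epsilon_1 > 0$ meaning elongation along the x-axis; the function names and the flat-sky offsets are hypothetical.

```python
import numpy as np

def tangential_ellipticity(dx, dy, e1, e2):
    """Tangential ellipticity component of sources relative to a lens.

    dx, dy : source position minus lens position (flat-sky offsets)
    e1, e2 : Cartesian ellipticity components of the sources
    """
    phi = np.arctan2(dy, dx)
    return -(e1 * np.cos(2.0 * phi) + e2 * np.sin(2.0 * phi))

def mean_tangential_profile(sep, e_t, bin_edges):
    """Average e_t in bins of pair separation; empty bins are returned as NaN."""
    counts, _ = np.histogram(sep, bins=bin_edges)
    sums, _ = np.histogram(sep, bins=bin_edges, weights=e_t)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)

# Two sources one unit from the lens, both elongated tangentially:
dx = np.array([1.0, 0.0])
dy = np.array([0.0, 1.0])
e1 = np.array([-0.1, 0.1])      # along y for the source on the x-axis, along x for the other
e2 = np.array([0.0, 0.0])
e_t = tangential_ellipticity(dx, dy, e1, e2)
profile = mean_tangential_profile(np.hypot(dx, dy), e_t, np.array([0.0, 2.0, 4.0]))
```

Splitting the lens sample into subsamples then amounts to calling the same two functions on each subset of pairs.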

The distinction between foreground and background galaxies is ideally 
performed using redshift information. This is indeed the case for the galaxy- 
galaxy lensing studies based on the Sloan Digital Sky Survey, for which early 
results have been reported by McKay et al. (2001); all lens galaxies used there 
have spectroscopic redshifts, whereas the source galaxies are substantially 
fainter than the lens galaxies so that they can be considered as a background 
population. For other surveys, the lack of redshift information requires the 
separation of galaxies to be based solely on their apparent magnitudes: fainter 
galaxies are on average at larger distances than brighter ones. However, the 
resulting samples of 'foreground' and 'background' galaxies will have (often 
substantial) overlap in redshift, which needs to be accounted for statistically 
in the quantitative analysis of these surveys. 

Quantitative analysis. The measurement of the galaxy-galaxy lensing signal
provides the tangential shear as a function of pair separation, $\gamma_{\rm t}(\theta)$. Without
information about the redshifts of individual galaxies, the separation of 
galaxies into a 'foreground' and 'background' population has to be based on 
apparent magnitudes only. In the ideal case of a huge number of foreground 
galaxies, one could investigate the mass properties of 'equal' galaxies, by 
finely binning them according to redshift, luminosity, color, morphology etc. 
However, in the real world such a fine binning has not yet been possible, and 
therefore, to convert the lensing signal into physical parameters of the lens, a 
parameterization of the lens population is needed. We shall outline here how 
such an analysis is performed. 

The first ingredient is the redshift probability distribution $p(z|m)$ of galaxies
with apparent magnitude $m$, which is assumed to be known from redshift 
surveys (and/or their extrapolation to fainter magnitudes). This probability 
density depends on the apparent magnitude $m$, with a broader distribution 
and larger mean redshift expected for fainter $m$. Since the distribution of 
'foreground' and 'background' galaxies in redshift is known for a given survey, 
the probabilities $p(z|m)$ can be employed to calculate the value of $D_{\rm ds}/D_{\rm s}$, 
averaged over all foreground-background pairs (with this ratio being set to 
zero if $z_{\rm s} < z_{\rm d}$). For given physical parameters of the lenses, the shear signal 
is proportional to this mean distance ratio. 
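
The averaging over foreground-background pairs can be sketched with a simple Monte-Carlo estimate. The example assumes a flat Universe, in which $D_{\rm ds}/D_{\rm s} = (w_{\rm s} - w_{\rm d})/w_{\rm s}$ in terms of comoving distances, with $\Omega_{\rm m} = 0.3$ as an illustrative value; the gamma-distributed redshifts are hypothetical stand-ins for draws from $p(z|m)$.

```python
import numpy as np

def comoving_distance(z, omega_m=0.3, n_steps=4096):
    """Comoving distance in units of c/H0 for a flat Universe (trapezoidal rule)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    z_grid = np.linspace(0.0, max(float(z.max()), 1e-6), n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + z_grid) ** 3 + 1.0 - omega_m)
    steps = 0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(z_grid)
    w_grid = np.concatenate(([0.0], np.cumsum(steps)))
    return np.interp(z, z_grid, w_grid)

def mean_distance_ratio(z_lens, z_src, omega_m=0.3):
    """Mean of D_ds/D_s over pairs, set to zero where the 'source' is in front."""
    z_lens = np.asarray(z_lens, dtype=float)
    z_src = np.asarray(z_src, dtype=float)
    w_d = comoving_distance(z_lens, omega_m)
    w_s = comoving_distance(z_src, omega_m)
    ratio = np.where(z_src > z_lens, 1.0 - w_d / np.maximum(w_s, 1e-12), 0.0)
    return float(np.mean(ratio))

# Hypothetical redshift draws for 'foreground' and 'background' samples.
rng = np.random.default_rng(42)
z_lens = rng.gamma(shape=3.0, scale=0.35 / 3.0, size=100_000)
z_src = rng.gamma(shape=3.0, scale=0.55 / 3.0, size=100_000)
ratio = mean_distance_ratio(z_lens, z_src)
```

Because the two redshift distributions overlap, a sizeable fraction of pairs contributes zero, which lowers the mean ratio relative to a clean foreground-background split.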

The mass profiles of galaxies are parameterized according to their luminosity.
For example, a popular parameterization is that of a truncated isothermal 
sphere, where the parameters are the line-of-sight velocity dispersion $\sigma$ (or 
the equivalent circular velocity $V_{\rm c} = \sqrt{2}\,\sigma$) and a truncation radius $s$ at which 
the $\rho \propto r^{-2}$ isothermal density profile turns into a steeper $\rho \propto r^{-4}$ law. The 
velocity dispersion is certainly dependent on the luminosity, as follows from 
the Tully-Fisher and Faber-Jackson relations for late- and early-type galaxies,
respectively. One therefore assumes the scaling $\sigma = \sigma_* (L/L_*)^{\beta/2}$, where 
$L_*$ is a fiducial luminosity (and which conveniently can be chosen close to the 
characteristic luminosity of the Schechter luminosity function). Furthermore, 
the truncation scale $s$ is assumed to follow the scaling $s = s_* (L/L_*)^{\eta}$. The 
total mass of a galaxy then is $M \propto \sigma^2 s$, or $M = M_* (L/L_*)^{\beta+\eta}$. 
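
These scaling relations are easy to encode. The sketch below assumes the specific truncated isothermal density $\rho(r) = \sigma^2 s^2 / [2\pi G\, r^2 (r^2 + s^2)]$, which has the stated $r^{-2}$ and $r^{-4}$ limits and integrates to a total mass $M = \pi \sigma^2 s / G$; this normalization and all numerical parameter values are assumptions made for illustration.

```python
import math

# Gravitational constant in units of kpc (km/s)^2 / M_sun.
G_KPC = 4.30091e-6

def velocity_dispersion(l_ratio, sigma_star, beta):
    """sigma = sigma_* (L/L_*)^(beta/2), in km/s."""
    return sigma_star * l_ratio ** (beta / 2.0)

def truncation_radius(l_ratio, s_star, eta):
    """s = s_* (L/L_*)^eta, in kpc."""
    return s_star * l_ratio ** eta

def total_mass(l_ratio, sigma_star, s_star, beta, eta):
    """Total mass pi sigma^2 s / G (solar masses) of the assumed truncated profile."""
    sigma = velocity_dispersion(l_ratio, sigma_star, beta)
    s = truncation_radius(l_ratio, s_star, eta)
    return math.pi * sigma**2 * s / G_KPC

# Hypothetical parameter values, chosen only to exercise the scaling.
params = dict(sigma_star=160.0, s_star=100.0, beta=0.6, eta=0.4)
mass_ratio = total_mass(2.0, **params) / total_mass(1.0, **params)
```

The mass ratio between two luminosities depends only on $\beta + \eta$, which is why the lensing signal constrains this combination more directly than either exponent alone.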

Suppose $m$ and $z$ were given; then, the luminosity of a galaxy would be 
known, and for given values of the parameters $\sigma_*$, $s_*$, $\beta$ and $\eta$, the mass 
properties of the lens galaxy would be determined. However, since $z$ is not 
known, but only its probability distribution, only the probability distribution 
of the lens luminosities, and therefore of the mass properties, is known. One 
could in principle determine the expected shear signal $\gamma_{\rm t}(\theta)$ for a given survey
by calculating the shear signal for a given set of redshifts $z_i$ for all lens 
and source galaxies, and then averaging this signal over the $z_i$ using the redshift
probability distribution $p(z_i|m_i)$. However, this very high-dimensional 
integration cannot be performed directly; instead, one uses a Monte-Carlo integration
method (Schneider & Rix 1997): Given the positions $\theta_i$ and magnitudes 
$m_i$ of the galaxies, one can draw for each of them a redshift according to 
$p(z_i|m_i)$, and then calculate the shear at all positions $\theta_i$ corresponding to a 
source galaxy, for each set of parameters $\sigma_*$, $s_*$, $\beta$ and $\eta$. This procedure can 
be repeated several times, yielding the expected shear $\langle \gamma_i \rangle$ and its dispersion 
$\sigma_{\gamma,i}$ for each source galaxy's position. One can then calculate the likelihood 
function 

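$$\mathcal{L} = \prod_i \frac{1}{\pi \left( \sigma_\epsilon^2 + \sigma_{\gamma,i}^2 \right)} \exp\left( - \frac{|\epsilon_i - \langle \gamma_i \rangle|^2}{\sigma_\epsilon^2 + \sigma_{\gamma,i}^2} \right) , \qquad (134)$$

where $\sigma_\epsilon$ is the intrinsic ellipticity dispersion of the galaxies. $\mathcal{L}$ depends on 
the parameters of the model, and can be maximized with respect to them, 
thereby yielding estimates of $\sigma_*$, $s_*$, $\beta$ and $\eta$. 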
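
A minimal sketch of the likelihood evaluation, assuming each observed complex ellipticity is Gaussian-distributed about the expected shear with variance $\sigma_\epsilon^2 + \sigma_{\gamma,i}^2$. The expected shear would in practice come from the Monte-Carlo step described above; here it is replaced by a hypothetical one-parameter template, and the dispersions are illustrative.

```python
import numpy as np

def log_likelihood(eps, gamma_mean, sigma_gamma, sigma_eps):
    """Logarithm of a product of 2-D Gaussians over all source galaxies.

    eps         : observed complex ellipticities
    gamma_mean  : expected complex shear at each source position
    sigma_gamma : dispersion of the expected shear at each source position
    sigma_eps   : intrinsic ellipticity dispersion
    """
    var = sigma_eps**2 + np.asarray(sigma_gamma) ** 2
    resid2 = np.abs(np.asarray(eps) - np.asarray(gamma_mean)) ** 2
    return float(np.sum(-np.log(np.pi * var) - resid2 / var))

# Toy setup: the model shear is an amplitude times a fixed unit-modulus pattern.
rng = np.random.default_rng(3)
n_src = 2000
template = np.exp(2j * rng.uniform(0.0, np.pi, n_src))
eps = 0.05 * template                     # noise-free "observations", amplitude 0.05
sigma_gamma = np.full(n_src, 0.01)

amplitudes = np.linspace(0.0, 0.1, 101)
log_l = np.array([log_likelihood(eps, a * template, sigma_gamma, 0.3) for a in amplitudes])
best_amplitude = float(amplitudes[np.argmax(log_l)])
```

Working with the logarithm avoids numerical underflow of the product over many thousands of sources; a real analysis would maximize over the lens parameters rather than a single amplitude.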

First detection. The galaxy-galaxy lensing effect was first found by Brainerd
et al. (1996), on a single $9.6' \times 9.6'$ field. They considered 'foreground' galaxies
in the magnitude range $m \in [20, 23]$, and 'background' galaxies with $m \in
[23, 24]$; this yielded 439 foreground and 506 background galaxies, and 3202
pairs with $\Delta\theta \in [5'', 34'']$.$^{12}$ For these pairs, the distribution of the alignment
angle $\varphi$ is plotted in Fig. 48. This distribution is clearly incompatible with
the absence of a lens signal (at the 99.9% confidence level), and thus provides
a solid detection.

They analyzed the lens signal $\gamma_{\rm t}(\theta)$ in a way similar to the method
outlined above, except that their Monte-Carlo simulations also randomized the
positions of galaxies. The resulting likelihood yields $\sigma_* \approx 160$ km/s (with
asymmetric bounds quoted at the 90% confidence level), whereas for $s_*$ only a
lower limit of $25\,h^{-1}$ kpc ($1\sigma$) is obtained; the small field size, in
combination with the relative insensitivity of the lensing signal to $s_*$ once this
value is larger than the mean transverse separation of lensing galaxies,
prohibited the detection of an upper bound on the halo size.

Galaxy-galaxy lensing from the Red-Sequence Cluster Survey (RCS) 

Several groups have published results of their galaxy-galaxy lensing surveys 
since its first detection. Here we shall describe the results of a recent wide-field 
imaging survey, the RCS; this survey was already described in the context of 
cosmic shear in Sect. 7.3. 45.5 square degrees of single-band imaging data were 
used (Hoekstra et al. 2004). Choosing lens galaxies with $19.5 < R_C < 21$, and
source galaxies having $21.5 < R_C < 24$, yielded $\sim 1.2 \times 10^5$ lenses with a
median redshift of 0.35 and $\sim 1.5 \times 10^6$ sources with a median redshift of $\sim 0.53$,
yielding $\langle D_{\rm ds}/D_{\rm s} \rangle = 0.29 \pm 0.01$ for the full sample of lenses and sources.
Fig. 49 shows the shear signal for this survey.

The lens signal is affected by galaxies counted as sources, but which in
fact lie in front of the lenses. As long as they are not physically associated with
lens galaxies, this effect is accounted for in the analysis, i.e., in the value of
$\langle D_{\rm ds}/D_{\rm s} \rangle$. However, if fainter galaxies cluster around lens galaxies, this
produces an additional effect. Provided the orientations of the associated faint
galaxies are random with respect to the separation vector to their bright
neighbor, these physical pairs just yield a dilution of the shear signal. The
amplitude of this effect can be determined from the angular correlation
function of bright and faint galaxies, and easily corrected for. Once this has been
done, the corrected shear signal within $10'' < \theta < 2'$ has been fitted with
an SIS model, yielding a mean velocity dispersion of the lens galaxies of
$\langle\sigma^2\rangle^{1/2} = 128 \pm 4$ km/s. If the scaling relation between galaxy luminosity
and velocity dispersion described above is employed, with $\beta = 0.6$, the
result is $\sigma_* = 140 \pm 4$ km/s for $L_* = 10^{10}\,h^{-2} L_\odot$ in the blue passband.

$^{12}$ The lower angular scale has been chosen to avoid overlapping isophotes of
foreground and background galaxies, whereas the upper limit was selected since it
gave the largest signal-to-noise for the deviation of the angular distribution shown
in Fig. 48 from a uniform one.

Fig. 49. (a) Tangential shear as a function of angular separation, obtained from
the RCS survey; the shear signal is detected out to nearly one degree scale.
(b) Cross shear signal, which is expected to vanish identically in the absence of
systematic effects on the ellipticity measurements. As can be seen, the cross
signal is indeed compatible with zero. The inset expands the scale, to better
show the error bars (from Hoekstra et al. 2003)
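
To get a feeling for the amplitude of the signal behind the SIS fit, the following Python sketch evaluates the standard SIS relations $\theta_{\rm E} = 4\pi (\sigma/c)^2 D_{\rm ds}/D_{\rm s}$ and $\gamma_{\rm t}(\theta) = \theta_{\rm E}/(2\theta)$. Inserting the mean velocity dispersion and the mean distance ratio quoted for the RCS is only a rough approximation to the proper average over the lens and source populations; the function names are our own.

```python
import math

C_KM_S = 299792.458                        # speed of light in km/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # radians -> arcseconds


def sis_einstein_radius(sigma_v, dds_over_ds):
    """Einstein radius (arcsec) of an SIS with velocity dispersion sigma_v (km/s)."""
    theta_e_rad = 4.0 * math.pi * (sigma_v / C_KM_S) ** 2 * dds_over_ds
    return theta_e_rad * ARCSEC_PER_RAD


def sis_tangential_shear(theta, sigma_v, dds_over_ds):
    """Tangential shear of an SIS at angular separation theta (arcsec)."""
    return sis_einstein_radius(sigma_v, dds_over_ds) / (2.0 * theta)


theta_e = sis_einstein_radius(128.0, 0.29)
print("theta_E [arcsec]:", theta_e)
for theta in (10.0, 30.0, 120.0):
    print(theta, sis_tangential_shear(theta, 128.0, 0.29))
```

With these inputs the Einstein radius is about 0.14 arcsec, so the tangential shear is below one percent over the whole fitted range, which illustrates why the signal can only be detected statistically.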

To interpret the shear results on larger angular scales, the SIS model
no longer suffices, and different mass models need to be employed. Using a
truncated isothermal model, the best-fitting values of the scaling parameters
$\beta = 0.60 \pm 0.11$ and $\eta = 0.24$ (with asymmetric uncertainties of about 0.2)
are obtained, when marginalizing over all other parameters. Furthermore,
$\sigma_* = 137 \pm 5$ km/s, in very close agreement with the result from small
separations and the SIS model; this is expected, since most of
the signal comes from these smaller separations. Most interestingly, the analysis
also yields an estimate of the truncation scale of $s_* = (185 \pm 30)\,h^{-1}$ kpc,
providing one of only a few estimates of the scale of the dark matter halo.
Hoekstra et al. also performed the analysis in the frame of an NFW mass
model.

These results can then be used to calculate the mass-to-light ratio of 
an L* galaxy and, using the scaling, of the galaxy population as a whole. 
Considering only galaxies with $M > 10^{10}\,h^{-1} M_\odot$, the mean mass-to-light
ratio inside the virial radius of galaxy halos is about 100 in solar units.

The shape of dark matter halos. In the mass models considered before,
the mass distribution of galaxies was assumed to be axi-symmetric. In fact,
this assumption is not crucial, since the relation between shear and surface
mass density, $\gamma_{\rm t}(\theta) = \bar\kappa(\theta) - \kappa(\theta)$, is true for a general mass distribution,
provided $\gamma_{\rm t}$ and $\kappa(\theta)$ are interpreted as the mean tangential shear and mean
surface mass density on a circle of radius $\theta$, and $\bar\kappa(\theta)$ as the mean surface
mass density inside this circle (see eq. 24). However, deviations from axial
symmetry are imprinted on the shear signal and can in principle be measured.
If the mass distribution is 'elliptical', the shear along the major axis
(at given distance $\theta$) is larger than that along the minor axis, and therefore,
an investigation of the strength of the shear signal relative to the orientation
of the galaxy can reveal a finite ellipticity of the mass distribution. For
that, it is necessary that the orientation of the mass distribution is (at least
approximately) known. Provided the orientation of the mass distribution
approximately follows the orientation of the luminous part of galaxies, one can
analyze the direction dependence of the shear relative to the major axis of the
light distribution (Natarajan & Refregier 2000). Hoekstra et al. (2002b) have
used the RCS to search for such a direction dependence; they parameterized
the lenses with a truncated isothermal profile with ellipticity $\epsilon_{\rm mass} = f\,\epsilon_{\rm light}$,
where $f$ is a free parameter. The result $f = 0.77 \pm 0.2$ indicates first that the
mass distribution of galaxies is not round (which would be the case for $f = 0$,
which is incompatible with the data), and second, that the mass distribution
is rounder than the light distribution, since $f < 1$. However, it must
be kept in mind that the assumption of equal orientation between light and
mass is crucial for the interpretation of $f$; misalignment causes a decrease of
$f$. Note that numerical simulations of galaxy evolution predict such a
misalignment between total mass and baryons, with an rms deviation of around
20° (van den Bosch et al. 2002). Given the above result on $f$, it is therefore
not excluded that the flattening of halos is very similar to that of the light.
Also note that this result yields a value averaged over all galaxies; since the
lens efficiency of elliptical galaxies (at given luminosity) is larger than that
of spirals, the value of $f$ is dominated by the contributions from early-type
galaxies.

Results from the Sloan Survey. The Sloan Digital Sky Survey (e.g., York 
et al. 2000) will map a quarter of the sky in five photometric bands, and 
obtain spectra of about one million galaxies. A large fraction of the data has 
already been taken by SDSS, and parts of this data have already been released 
(Abazajian et al. 2004). The huge amount of photometric data in principle 
is ideal for weak lensing studies, as it beats down statistical uncertainties to 
an unprecedented low level. However, the site of the telescope, the relatively 
large pixel size of $0.4''$, the relatively shallow exposures of about one minute
and the drift-scan mode in which data are taken (yielding excellent flat-fielding,
and thus photometric properties, somewhat at the expense of the
shape of the PSF) render the data less useful for, e.g., cosmic shear studies:
the small mean redshift of the galaxies yields a very small expectation value
of the cosmic shear, which can easily be mimicked by residuals from PSF
corrections. However, galaxy-galaxy lensing is much less sensitive to larger-scale
PSF problems, since the component of the shear used in the analysis
is not attached to pixel directions, but to neighboring galaxies, and thus
varies rapidly with sky position. Another way of expressing this fact is that
the galaxy-galaxy lensing signal would remain unchanged if a uniform shear
were added to the data; therefore, SDSS provides a great opportunity
for studying the mass profile of galaxies.

Fischer et al. (2000) reported the first results from the SDSS, and a larger
fraction of the SDSS data was subsequently used in a galaxy-galaxy lensing
study by McKay et al. (2001), where the spectroscopic redshifts of the
lens galaxies were also used. Their sample consists of $\sim 31\,000$ lens galaxies with
measured redshifts, and $\sim 3.6 \times 10^6$ source galaxies selected in the brightness
range $18 < r < 22$. For this magnitude range, the redshift distribution
of galaxies is fairly well known, leaving little calibration uncertainty in the
interpretation of the shear signal. In particular, there is very little overlap in
the redshift distribution of source and lens galaxies. The data set has been
subjected to a large number of tests, to reveal systematics; e.g., null results
are obtained when the source galaxies are rotated by 45° (or, equivalently,
if $\gamma_\times$ is used instead of $\gamma_{\rm t}$), or if the lens galaxies are replaced by an equal
number of randomly distributed points relative to which the tangential shear
component is measured. Since the redshifts of the lens galaxies are known,
the shear can be measured directly in physical units, so one can determine

$$ \Delta\Sigma_+ = \bar\Sigma(<R) - \Sigma(R) \tag{135} $$

in $M_\odot/{\rm pc}^2$ as a function of $R$ in kpc.
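
The quantity $\bar\Sigma(<R) - \Sigma(R)$ can be evaluated numerically for any axi-symmetric surface mass density profile. The Python sketch below does so with a simple midpoint rule and uses a singular isothermal sphere, $\Sigma(R) = \sigma^2/(2GR)$, as a test profile; for this profile the result equals $\Sigma(R)$ itself. The velocity dispersion and radius in the example are illustrative assumptions, and the function names are our own.

```python
import numpy as np

G_PC = 4.30091e-3   # gravitational constant in pc (km/s)^2 / M_sun


def sis_surface_density(R_pc, sigma_v):
    """Surface mass density (M_sun / pc^2) of an SIS at projected radius R_pc."""
    return sigma_v ** 2 / (2.0 * G_PC * R_pc)


def delta_sigma(profile, R_pc, n_steps=2000):
    """Mean surface density inside R minus surface density at R.

    The midpoint rule avoids evaluating the profile at R' = 0.
    """
    edges = np.linspace(0.0, R_pc, n_steps + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    dR = R_pc / n_steps
    mean_inside = 2.0 / R_pc ** 2 * np.sum(profile(mid) * mid) * dR
    return mean_inside - profile(R_pc)


def profile(R):
    # Illustrative lens: SIS with sigma_v = 140 km/s.
    return sis_surface_density(R, 140.0)


print(delta_sigma(profile, 1.0e5))   # at R = 100 kpc, in M_sun / pc^2
```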

Fig. 50. The galaxy-galaxy lensing signal from the SDSS plotted against physical
radius $R$. The lens sample has been subdivided into early- and late-type
galaxies (upper panel), and into galaxies situated in dense environments vs.
those with a smaller neighboring galaxy density (lower panel). The figure clearly
shows that the lensing signal is dominated by elliptical galaxies, and by those
located in dense environments. Owing to the morphology-density relation of
galaxies, these two results are not mutually independent. Note that the lensing
signal can be measured out to $1\,h^{-1}$ Mpc, considerably larger than the expected
size of galaxy halos; therefore, the shear at these large separations is most likely
caused by the larger-scale mass distribution in which the galaxies are embedded
(from McKay et al. 2001)

Fig. 50 shows the lensing result from McKay et al. (2001), where the
lens sample has been split according to the type of galaxy (early vs. late
type) and according to the local spatial number density of galaxies, which is
known owing to the spectroscopic redshifts. The fact that most of the signal
on small scales is due to ellipticals is expected, as they are more massive at
given luminosity than spirals. The large spatial extent of the shear signal for
ellipticals relative to that of spirals can be interpreted either by ellipticals
having a larger halo than spirals, or by ellipticals being preferentially found
in high-density environments, which contribute to the lens signal on large
scales. This latter interpretation is supported by the lower panel in Fig. 50,
which shows that the signal on large scales is entirely due to lens galaxies in
dense environments. This then implies that the galaxy-galaxy lensing signal
on large scales no longer measures the density profile of individual galaxies,
but gets more and more dominated by group and cluster halos in which these
(predominantly early-type) galaxies are embedded.

A separation of these contributions from the data themselves is not possible
at present, but can be achieved in the frame of a theoretical model. Guzik
& Seljak (2001) employed the halo model for the distribution of matter in
the universe (see Cooray & Sheth 2002) to perform this separation. There,
the galaxy-galaxy lensing signal comes either from matter in the same halo
in which the galaxy is embedded, or from other halos which are physically
associated (i.e., clustered) with the former. This latter contribution is
negligible on the scales below $\sim 1\,h^{-1}$ Mpc on which the SDSS obtained a
measurement. The former contribution can be split further into two terms:
the first is from the dark matter around the galaxies themselves, whereas the
second is due to the matter in groups and clusters to which the galaxies might
belong. The relative amplitude of these two terms depends on the fraction of
galaxies which are located in groups and clusters; the larger this fraction, the
more important are larger-scale halos for the shear signal. Guzik & Seljak
estimate from the radial dependence of the SDSS signal that about 20% of
galaxies reside in groups and clusters; on scales larger than about $200\,h^{-1}$ kpc
their contribution dominates. The virial mass of an early-type $L_*$ galaxy is
estimated to be $M_{200}(L_*) = (9.3 \pm 2.2) \times 10^{11}\,h^{-1} M_\odot$, and about a factor
of three smaller for late-type galaxies (with luminosity measured in a red
passband; the differences are substantially larger for bluer passbands, owing
to the sensitivity of the luminosity to star formation activity in late types).
From the mass-to-light ratio in red passbands, Guzik & Seljak estimate that
an $L_*$ galaxy converts about 10-15% of its virial mass into stars. Since this
fraction is close to the baryon fraction in the universe, they conclude that
most of the baryons of an $L_*$ galaxy are transformed into stars. For more
massive halos, the mass-to-light ratio increases ($M/L \propto L^{0.4 \pm 0.2}$), and therefore
their conversion of baryons into stars is smaller - in agreement with what
we argued about clusters, where most of the baryons are present in the form
of a hot intracluster gas.

Yang et al. (2003) studied the cross-correlation between mass and galaxies
using numerical simulations of structure formation and semi-analytic models
of galaxy evolution. The observed dependence of the galaxy-galaxy lensing
signal on galaxy luminosity, morphological type and galaxy environment, as
obtained by McKay et al. (2001), is well reproduced in these simulations. The
galaxy-mass correlation is affected by satellite galaxies, i.e., galaxies not situated
at the center of their respective halo. Central galaxies can be selected by
restricting the foreground galaxy sample to relatively isolated galaxies. The
galaxy-galaxy lensing signal for such central galaxies can be described well
by an NFW mass profile, whereas this is no longer true if all galaxies are considered.
Combining the measurement with the simulation, they find that an
$L_*$-galaxy typically resides in a halo with a virial mass of $\sim 2 \times 10^{12}\,h^{-1} M_\odot$.

With the SDSS progressing, larger datasets become available, allowing a 
more refined analysis of galaxy-galaxy lensing (Sheldon et al. 2004; Seljak et
al. 2004). In the analysis of Seljak et al. (2004), more than $2.7 \times 10^5$ galaxies
with spectroscopic redshifts have been used as foreground galaxies, and as 
background population those fainter galaxies for which photometric redshifts 
have been estimated. The resulting signal is shown in Fig. 51, for six different 
bins in (foreground) galaxy luminosity. 

In a further test to constrain systematic effects in the data, Hirata et 
al. (2004) have used spectroscopic and photometric redshifts to study the 
question whether an alignment of satellite galaxies around the lens galaxies 
can affect the galaxy-galaxy lensing signal from the SDSS; they obtain an 
upper limit of a 15% contamination. 

The SDSS has already yielded important information about the mass
properties of galaxies; taking into account that only a part of the data of the
complete survey has been used in the studies mentioned above, a galaxy-galaxy
lensing analysis of the final survey will yield a rich harvest.

Lensing by galaxies in clusters. As an extension of the method presented 
hitherto, one might use galaxy-galaxy lensing also to specifically target the 
mass profile of galaxies in the inner part of clusters. One might expect that 
owing to tidal stripping, their dark matter halo has a considerably smaller 
spatial extent than that of the galaxy population as a whole. The study of 
this effect with lensing is more complicated than galaxy-galaxy lensing in 
the field, both observationally and from theory. Observationally, the data 
sets that can be used need to be taken in the inner part of massive clusters; 
since these are rare, a single wide-field image usually contains at most one 
such cluster. Furthermore, the number of massive galaxies projected near the 
center of a cluster is fairly small. Therefore, in order to obtain good statistics, 
the data of different clusters should be combined. Since the cores of clusters 
are optically bright, measuring the shape of faint background galaxies is more 
difficult than in a blank field. From the theoretical side, the lensing strength
of the cluster is much stronger than that of the individual cluster galaxies,
and so this large-scale shear contribution needs to be accounted for in the
galaxy-galaxy lensing analysis.

Fig. 51. The galaxy-galaxy lensing signal for six luminosity bins of foreground
galaxies, as indicated by the absolute magnitude interval in each panel. The curves
show a two-parameter model fitted to the data, based on the halo model, and the fit
parameters are indicated: $M$ is the virial mass of the halo (in units of $10^{11}\,h^{-1} M_\odot$)
in which the galaxies reside, and $\alpha$ is the fraction of the galaxies which are not
central inside the halo, but satellite galaxies (from Seljak et al. 2004)

Methods for performing this separation between cluster and galaxy shear 
were developed by Natarajan & Kneib (1997) and Geiger & Schneider (1998). 
Perhaps the simplest approach is provided by the aperture mass methods, 
applied to the individual cluster galaxies; there one measures the tangential 
shear inside an annulus around each cluster galaxy. This measure is insensitive
to the shear contribution which is linear in the angular variable $\theta$,
which is a first local approximation to the larger-scale shear caused by the
cluster. Alternatively, a mass model of the (smoothed) cluster can be obtained,
either from strong or weak lensing constraints, or preferentially both,
and subtracted from the shear signal around galaxies to see their signal.
However, once the mass fraction in the galaxies becomes considerable, this 
method starts to become biased. Geiger & Schneider (1999) have suggested 
to simultaneously perform a weak lensing mass reconstruction of the cluster 
and a determination of the parameters of a conveniently parameterized mass 
model of cluster galaxies (e.g., the truncated isothermal sphere); since the 
maximum likelihood method for the mass reconstruction (see Sect. 5.3) was 
used, the solution results from maximizing the likelihood with respect to the 
mass profile parameters (the deflection potential on a grid) and the galaxy 
mass parameters. 

Fig. 52. Significance contours (solid) for galaxy properties obtained from
galaxy-galaxy lensing of galaxies in the cluster Cl 0939+4713. The parameters are the
velocity dispersion $\sigma_*$ and the halo truncation radius $s_*$ of an $L_*$-galaxy. Based on
HST data (see Fig. 22), a simultaneous reconstruction of the cluster mass profile
and a determination of the galaxy mass parameters were performed. No significant
lensing signal is seen from the 55 late-type galaxies (lower panel), but a clear
detection and an upper bound on the halo size are obtained for the 56 early-types.
Dashed and dotted curves connect models with the same mass inside $8\,h^{-1}$ kpc and
the same total mass of an $L_*$-galaxy, respectively (from Geiger & Schneider 1999)



Natarajan et al. (1998), by analyzing HST data of the cluster AC114,
concluded that the truncation radius of a fiducial $L_*$ galaxy in this cluster is
$\sim 15\,h^{-1}$ kpc; similarly, Geiger & Schneider (1999) showed that the best-fitting
truncation radius for early-type galaxies in the cluster A851 is $\sim 10\,h^{-1}$ kpc
(see Fig. 52). Although the uncertainties are fairly large, these results indicate
that galaxies near cluster centers indeed have a halo size considerably smaller
than the average galaxy. The sample of clusters which can be investigated
using this method will dramatically increase once the cluster sample observed
with the new ACS camera onboard HST becomes available and gets properly
analyzed.



8.3 Galaxy biasing: shear method 

On small scales, galaxy-galaxy lensing measures the mass profile of galaxies, 
whereas on intermediate scales the environment of galaxies starts to dominate 
the shear signal. On even larger scales (say, beyond $\sim 1\,h^{-1}$ Mpc), the host halo
contribution becomes negligible. Beyond that distance, any signal must come
from the correlation of galaxy positions with the mass distribution in the 
Universe. This correlation, and the related issue of galaxy biasing (see Sect. 
6.1 of IN), can ideally be studied with weak lensing. In this section we shall 
outline how these quantities can be determined from shear measurements, and 
describe some recent results. As we shall see, this issue is intimately related 
to galaxy-galaxy lensing. The next section deals with the magnification of 
distant sources caused by mass overdensities correlated with galaxies and 
thereby causing an apparent correlation between high-redshift sources and 
low-redshift galaxies; the amplitude of this signal is again proportional to the 
correlation between galaxies and the underlying dark matter. 

An interesting illustration of the correlation between galaxies and mass 
has been derived by Wilson et al. (2001). They studied 6 fields of $30' \times
30'$ each, selected bright early-type galaxies from their $V - I$ colors and $I$
magnitudes, and measured the shear from faint galaxies. Assuming that mass
is strongly correlated with early-type galaxies, these can be used to predict
the shear field, with an overall normalization given by the mean mass-to-light
ratio of the early-type galaxies. This correlation has indeed been found, at
the $5.2\sigma$ significance level, and a value of $M/L \approx 300\,h$ in solar units has
been obtained, assuming a flat low-density Universe.



The galaxy-mass correlation and the bias parameter. First, the concept
of the correlation between galaxies and mass shall be described more
quantitatively. The mass density inhomogeneities are described, as before,
by the dimensionless density contrast $\delta(\mathbf{x}, w)$. In analogy to this quantity,
one defines the number density contrast $\delta_{\rm g}(\mathbf{x}, w)$ of galaxies as

$$ \delta_{\rm g}(\mathbf{x}, w) := \frac{n(\mathbf{x}, w) - \bar n(w)}{\bar n(w)} , \tag{136} $$

where $n(\mathbf{x}, w)$ is the number density of galaxies at comoving position $\mathbf{x}$ and
comoving distance $w$ (the latter providing a parameterization of cosmic time
or redshift), and $\bar n(w)$ is the mean number density of galaxies at that epoch.
Since the galaxy distribution is discrete, the true number density is simply a
sum of delta-functions. What is meant by $n$ is that the probability of finding
a galaxy in the volume ${\rm d}V$ situated at position $\mathbf{x}$ is $n(\mathbf{x})\,{\rm d}V$.
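
In practice, the number density contrast of a discrete galaxy sample is often estimated by counts in cells. The following Python sketch computes $(n - \bar n)/\bar n$ on a regular grid for a mock catalog; the cubic box, the grid size and the unclustered (uniform random) mock positions are assumptions made only for illustration.

```python
import numpy as np


def galaxy_density_contrast(positions, box_size, n_cells):
    """Counts-in-cells estimate of delta_g = (n - n_mean) / n_mean.

    positions: array of shape (n_gal, 3) with coordinates in [0, box_size).
    """
    counts, _ = np.histogramdd(positions, bins=n_cells,
                               range=[(0.0, box_size)] * 3)
    n_mean = counts.mean()
    return (counts - n_mean) / n_mean


# Mock catalog: unclustered galaxies, so delta_g is pure Poisson noise here.
rng = np.random.default_rng(1)
positions = rng.uniform(0.0, 100.0, size=(50_000, 3))
delta_g = galaxy_density_contrast(positions, box_size=100.0, n_cells=8)
print(delta_g.mean(), delta_g.std())
```

For such an unclustered sample, the scatter of the estimate is set by Poisson sampling alone, which is the noise floor above which any stochasticity of the galaxy distribution would have to be measured.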

The relation between $\delta$ and $\delta_{\rm g}$ describes the relative distribution of galaxies
and matter in the Universe. The simplest case is that of an unbiased
distribution, for which $\delta_{\rm g} = \delta$; then, the probability of finding a galaxy at
any location would be just proportional to the matter density. However, one
might expect that the relation between luminous and dark matter is more
complicated. For example, galaxies are expected to form preferentially in the
high-density peaks in the early Universe, which would imply that there are
proportionally more galaxies within mass overdensities. This led to the
introduction of the concept of biasing (e.g., Bardeen et al. 1986; Kaiser 1984). The
simplest form of biasing, called linear deterministic biasing, is provided by
setting $\delta_{\rm g} = b\,\delta$, with $b$ being the bias parameter. One might suspect that the
relative bias is approximately constant on large scales, where the density field
is still in its linear evolution (i.e., on scales $\gtrsim 10\,h^{-1}$ Mpc today). On smaller
scales, however, $b$ most likely is no longer simply a constant. For example,
the spatial distribution of galaxies in clusters seems to deviate from the radial
mass profile, and the distributions of different galaxy types are different.
Furthermore, by comparing the clustering properties of galaxies of different
types, one can determine their relative bias, from which it is concluded that
more luminous galaxies are more strongly biased than less luminous ones, and
early-type galaxies are more strongly clustered than late-types (see Norberg
et al. 2001 and Zehavi et al. 2002 for recent results from the 2dFGRS and the
SDSS). This is also expected from theoretical models and numerical simulations
which show that more massive halos cluster more strongly (e.g., Sheth
et al. 2001; Jing 1998). In order to account for a possible scale dependence of
the bias, one considers the Fourier transforms of $\delta$ and $\delta_{\rm g}$ and relates them
according to

$$ \hat\delta_{\rm g}(\mathbf{k}, w) = b(|\mathbf{k}|, w)\, \hat\delta(\mathbf{k}, w) , \tag{137} $$

thus accounting for a possible scale and redshift dependence of the bias.

Even this more general bias description is most likely too simple, as it is
still deterministic. Owing to the complexity of galaxy formation and evolution,
it is to be expected that the galaxy distribution is subject to stochasticity
in excess of Poisson sampling (Tegmark & Peebles 1998; Dekel & Lahav
1999). To account for that, another parameter is introduced, the correlation
parameter $r(|\mathbf{k}|, w)$, which in general will also depend on scale and cosmic
epoch. To define it, we first consider the correlator

$$ \langle \hat\delta(\mathbf{k}, w)\, \hat\delta_{\rm g}^*(\mathbf{k}', w) \rangle = (2\pi)^3\, \delta_{\rm D}(\mathbf{k} - \mathbf{k}')\, P_{\delta{\rm g}}(|\mathbf{k}|, w) , \tag{138} $$

where the occurrence of the delta function is due to the statistical homogeneity
of the density fields, and $P_{\delta{\rm g}}$ denotes the cross-power between galaxies
and matter. The correlation parameter $r$ is then defined as

$$ r(|\mathbf{k}|, w) = \frac{P_{\delta{\rm g}}(|\mathbf{k}|, w)}{\sqrt{P_\delta(|\mathbf{k}|, w)\, P_{\rm g}(|\mathbf{k}|, w)}} . \tag{139} $$

In the case of stochastic biasing, the definition of the bias parameter is
modified to

$$ P_{\rm g}(|\mathbf{k}|, w) = b^2(|\mathbf{k}|, w)\, P_\delta(|\mathbf{k}|, w) , \tag{140} $$

which agrees with the definition (137) in the case of $r = 1$, but is more general
since (140) no longer relates the phase of (the Fourier transform of) $\delta_{\rm g}$ to that
of $\delta$. Combining the last two equations yields

$$ P_{\delta{\rm g}}(|\mathbf{k}|, w) = b(|\mathbf{k}|, w)\, r(|\mathbf{k}|, w)\, P_\delta(|\mathbf{k}|, w) . \tag{141} $$

We point out again that galaxy redshift surveys are used to determine the
two-point statistics of the galaxy distribution, and therefore $P_{\rm g}$; in order to
relate these measurements to $P_\delta$, assumptions on the properties of the bias
have to be made. As we shall discuss next, weak lensing can determine both
the bias parameter and the correlation parameter.
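
The definitions of $b$ and $r$ in terms of the three power spectra can be illustrated with a toy model in Python. Here the "galaxy" Fourier modes are taken to be a linear function of the matter modes plus an uncorrelated random component; this stochastic model, the amplitudes 2.0 and 1.5, and all names are assumptions chosen only to show that stochasticity raises $b$ above the deterministic coefficient and lowers $r$ below unity.

```python
import numpy as np


def bias_and_correlation(p_delta, p_g, p_cross):
    """Bias b = sqrt(P_g / P_delta) and correlation r = P_dg / sqrt(P_delta P_g)."""
    b = np.sqrt(p_g / p_delta)
    r = p_cross / np.sqrt(p_delta * p_g)
    return b, r


# Toy Fourier modes at one wavenumber: delta_g = 2.0 * delta + 1.5 * noise.
rng = np.random.default_rng(0)
n_modes = 200_000
delta = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
noise = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
delta_g = 2.0 * delta + 1.5 * noise

# Power spectra estimated as averages over modes.
p_delta = np.mean(np.abs(delta) ** 2)
p_g = np.mean(np.abs(delta_g) ** 2)
p_cross = np.mean((delta * np.conj(delta_g)).real)

b, r = bias_and_correlation(p_delta, p_g, p_cross)
print(b, r)   # expected values in this toy model: b = 2.5, r = 0.8
```

In this toy model the expectation values are $P_\delta = 2$, $P_{\rm g} = 12.5$ and $P_{\delta{\rm g}} = 4$, so that $b = 2.5$ and $r = 0.8$, up to sampling noise; by construction the estimates satisfy $P_{\delta{\rm g}} = b\,r\,P_\delta$.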



The principle. In order to determine b and r, the three power spectra 
defined above (or functions thereof) need to be measured. Second-order cosmic
shear measures, as discussed in Sect. 6, are proportional to the power
spectrum $P_\delta$. The correlation function of galaxies is linearly related to $P_{\rm g}$.
In particular, the three-dimensional correlation function is just the Fourier
transform of $P_{\rm g}$, whereas the angular correlation function contains a projection
of $P_{\rm g}$ along the line-of-sight and thus follows from Limber's equation as
discussed in Sect. 6.2. Finally, the cross-power $P_{\delta{\rm g}}$ describes the correlation
between mass and light, and thus determines the relation between the lens- 
ing properties of the mass distribution in the Universe to the location of the 
galaxies. Galaxy-galaxy lensing on large angular scales (where the mass pro- 
file of individual galaxies no longer yields a significant contribution) provides 
one of the measures for such a correlation. Hence, measurements of these 
three statistical distributions allow a determination of r and b. 

As we shall consider projected densities, we relate the density field of
galaxies on the sky to the spatial distribution. Hence, consider a population of
('foreground') galaxies with spatial number density $n(\mathbf{x}, w)$. The number
density of these galaxies on the sky at $\boldsymbol\theta$ is then
$N(\boldsymbol\theta) = \int {\rm d}w\; \nu(w)\, n(f_K(w)\boldsymbol\theta, w)$,
where $\nu(w)$ is the redshift-dependent selection function, describing which
fraction of the galaxies at comoving distance $w$ are included in the sample.
Foremost, this accounts for the fact that for large distances, only the more
luminous galaxies will be in the observed galaxy sample, but $\nu$ can account
also for more subtle effects, such as spectral features entering or leaving the
photometric bands due to redshifting. The mean number density of galaxies
on the sky is $\bar N = \int {\rm d}w\; \nu(w)\, \bar n(w)$; the redshift distribution, or more
precisely, the distribution in comoving distance, of these galaxies therefore
is $p_{\rm f}(w) = \nu(w)\, \bar n(w) / \bar N$, thus relating the selection function $\nu(w)$ to the
redshift distribution. Using the definition (136), one then finds that

$$ N(\boldsymbol\theta) = \bar N \left[ 1 + \int {\rm d}w\; p_{\rm f}(w)\, \delta_{\rm g}(f_K(w)\boldsymbol\theta, w) \right] . \tag{142} $$

We shall denote the fractional number density by
$\kappa_{\rm g}(\boldsymbol\theta) := [N(\boldsymbol\theta) - \bar N]/\bar N = \int {\rm d}w\; p_{\rm f}(w)\, \delta_{\rm g}(f_K(w)\boldsymbol\theta, w)$.



Aperture measures. We have seen in Sect. 6.3 that the aperture mass
dispersion provides a very convenient measure of second-order cosmic shear
statistics. Therefore, it is tempting to use aperture measures also for the
determination of the bias and the mass-galaxy correlation. In analogy
to the definition of the aperture mass $M_{\rm ap}$ in terms of the projected mass
density, define the aperture counts (Schneider 1998),

$$ \mathcal{N}(\theta) = \int {\rm d}^2\vartheta\; U(|\boldsymbol\vartheta|)\, \kappa_{\rm g}(\boldsymbol\vartheta) , \tag{143} $$

where the integral extends over the aperture of angular radius $\theta$, and $\boldsymbol\vartheta$
measures the position relative to the center of the aperture. An unbiased estimate
of the aperture counts is $\bar N^{-1} \sum_i U(|\boldsymbol\vartheta_i|)$, where the $\boldsymbol\vartheta_i$ are the positions of
the galaxies. We now consider the dispersion of the aperture counts,

$$ \langle \mathcal{N}^2(\theta) \rangle = \int {\rm d}^2\vartheta\; U(|\boldsymbol\vartheta|) \int {\rm d}^2\vartheta'\; U(|\boldsymbol\vartheta'|)\, \langle \kappa_{\rm g}(\boldsymbol\vartheta)\, \kappa_{\rm g}(\boldsymbol\vartheta') \rangle . \tag{144} $$

The correlator in the last expression is the angular two-point correlation
function $\omega(\Delta\vartheta)$ of the galaxies; its Fourier transform is the angular power
spectrum $P_\omega(\ell)$ of galaxies. Using the definition of $\kappa_{\rm g}$ together with the result
(98) allows us to express $P_\omega(\ell)$ in terms of the three-dimensional power spectrum
of the galaxy distribution,

$$ P_\omega(\ell) = \int {\rm d}w\; \frac{p_{\rm f}^2(w)}{f_K^2(w)}\, P_{\rm g}\!\left( \frac{\ell}{f_K(w)}, w \right)
 = \bar b^2 \int {\rm d}w\; \frac{p_{\rm f}^2(w)}{f_K^2(w)}\, P_\delta\!\left( \frac{\ell}{f_K(w)}, w \right) , \tag{145} $$

where we made use of (140), and in the final step we defined the mean bias
parameter $\bar b$, which is a weighted average of the bias parameter over the
redshift distribution of the galaxies and which depends on the angular wave
number $\ell$. To simplify notation, we shall drop the bar on $b$ and consider the
bias factor as being conveniently averaged over redshift (and later, also over
spatial scale). The aperture count dispersion then becomes

$$ \langle \mathcal{N}^2(\theta) \rangle = \frac{1}{2\pi} \int {\rm d}\ell\; \ell\, P_\omega(\ell)\, W_{\rm ap}(\theta\ell) = 2\pi\, b^2\, H_{\rm gg}(\theta) , \tag{146} $$
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where W ap is given in (109), and we have defined 



with 



H gg (9) = J dw j^V{w,6) , (147) 

n W ,0) = ^J< i llPs( J ^ yW )w apm . (148) 

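Using the same notation (following Hoekstra et al. 2002c), we can write the
aperture mass dispersion as

$$\langle M_{\rm ap}^2(\theta) \rangle = \frac{9\pi}{2} \left(\frac{H_0}{c}\right)^4 \Omega_{\rm m}^2\, H_{\kappa\kappa}(\theta) , \tag{149}$$

with

$$H_{\kappa\kappa}(\theta) = \int {\rm d}w\; \frac{g^2(w)}{a^2(w)}\, \mathcal{P}(w, \theta) , \tag{150}$$

where $g(w)$ (see eq. 94) describes the source-redshift weighted efficiency factor
of a lens at distance $w$. One therefore obtains an expression for the bias factor,

$$b^2 = \frac{9}{4} \left(\frac{H_0}{c}\right)^4 \frac{H_{\kappa\kappa}(\theta)}{H_{\rm gg}(\theta)}\, \Omega_{\rm m}^2\, \frac{\langle \mathcal{N}^2(\theta) \rangle}{\langle M_{\rm ap}^2(\theta) \rangle} = f_{\rm b}(\theta)\, \Omega_{\rm m}^2\, \frac{\langle \mathcal{N}^2(\theta) \rangle}{\langle M_{\rm ap}^2(\theta) \rangle} . \tag{151}$$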

Note that $f_{\rm b}(\theta)$ depends, besides the aperture radius $\theta$, on the cosmological
parameters $\Omega_{\rm m}$ and $\Omega_\Lambda$, but for a given cosmological model, it depends only
weakly on the filter scale $\theta$ and on the adopted power spectrum $P_\delta$ (van Waerbeke
1998; Hoekstra et al. 2002c). This is due to the fact that both $\langle \mathcal{N}^2(\theta) \rangle$
and $\langle M_{\rm ap}^2(\theta) \rangle$ are linear in the power spectrum, through the functions $H$, and
in both cases they probe only a very narrow range of $k$-values, owing to the
narrow width of the filter function $W_{\rm ap}$. Hence, the ratio $\langle \mathcal{N}^2(\theta) \rangle / \langle M_{\rm ap}^2(\theta) \rangle$
is expected to be very close to a constant if the bias factor $b$ is scale independent.
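
As a brief aside, the estimator for the aperture counts quoted after (143) can be sketched in a few lines. The filter used here is an assumption: we adopt the polynomial compensated filter $U(\vartheta) = \frac{9}{\pi\theta^2}(1 - \vartheta^2/\theta^2)(\frac{1}{3} - \vartheta^2/\theta^2)$ for $\vartheta \le \theta$, and it should be checked against the filter defined earlier in these notes. The function names and the flat-sky galaxy positions are hypothetical and serve only for illustration.

```python
import math


def u_filter(r, theta):
    """Assumed compensated filter U(r) for an aperture of radius theta."""
    x = r / theta
    if x > 1.0:
        return 0.0  # the filter has finite support
    return 9.0 / (math.pi * theta**2) * (1.0 - x**2) * (1.0 / 3.0 - x**2)


def aperture_counts(positions, centre, theta, nbar):
    """Estimate N(theta) as (1/nbar) * sum_i U(|theta_i|).

    positions: iterable of (x, y) galaxy positions (flat sky),
    centre:    (x, y) of the aperture centre,
    nbar:      mean number density of galaxies.
    """
    cx, cy = centre
    total = sum(u_filter(math.hypot(x - cx, y - cy), theta) for x, y in positions)
    return total / nbar


def filter_integral(theta, steps=20000):
    """Midpoint-rule value of int d^2r U(r); it vanishes for a compensated filter."""
    dr = theta / steps
    return sum(
        2.0 * math.pi * (i + 0.5) * dr * u_filter((i + 0.5) * dr, theta) * dr
        for i in range(steps)
    )
```

Because the filter is compensated, a uniform galaxy distribution yields an expectation value of zero, so that the estimator responds only to fluctuations of the galaxy number density inside the aperture.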

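Next we consider the correlation coefficient $r$ between the dark matter
distribution and the galaxy field. Correlating $M_{\rm ap}(\theta)$ with $\mathcal{N}(\theta)$ yields

$$\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle = \int {\rm d}^2\vartheta\; U(|\boldsymbol{\vartheta}|) \int {\rm d}^2\vartheta'\; U(|\boldsymbol{\vartheta}'|)\, \langle \kappa(\boldsymbol{\vartheta})\, \kappa_{\rm g}(\boldsymbol{\vartheta}') \rangle = 3\pi \left(\frac{H_0}{c}\right)^2 \Omega_{\rm m}\, b\, r\, H_{\kappa{\rm g}}(\theta) , \tag{152}$$

with

$$H_{\kappa{\rm g}}(\theta) = \int {\rm d}w\; \frac{p_{\rm f}(w)\, g(w)}{a(w)\, f_K(w)}\, \mathcal{P}(w, \theta) . \tag{153}$$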

It should be noted that $\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle$ is a first-order statistics in the cosmic
shear. It correlates the shear signal with the location of galaxies, which
are assumed to trace the total matter distribution. As shown in Schneider
(1998), the signal-to-noise of this correlator is higher than that of $\langle M_{\rm ap}^2 \rangle$, and
therefore was introduced as a convenient statistics for the detection of cosmic
shear. In fact, in their original analysis of the RCS, based on 16 deg$^2$, Hoekstra
et al. (2001) obtained a significant signal for $\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle$, but not for
$\langle M_{\rm ap}^2(\theta) \rangle$. Combining (146) and (149) with (152), the correlation coefficient
$r$ can be expressed as

$$r = \frac{\sqrt{H_{\kappa\kappa}(\theta)\, H_{\rm gg}(\theta)}}{H_{\kappa{\rm g}}(\theta)}\, \frac{\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle}{\sqrt{\langle M_{\rm ap}^2(\theta) \rangle\, \langle \mathcal{N}^2(\theta) \rangle}} = f_{\rm r}(\theta)\, \frac{\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle}{\sqrt{\langle M_{\rm ap}^2(\theta) \rangle\, \langle \mathcal{N}^2(\theta) \rangle}} . \tag{154}$$

As was the case for $f_{\rm b}$, the function $f_{\rm r}$ depends only very weakly on the filter
scale and on the adopted form of the power spectrum, so that a variation
of the (observable) final ratio with angular scale would indicate the scale
dependence of the correlation coefficient.
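
The way in which (146), (149) and (152) combine into the estimates (151) and (154) can be checked with a short numerical sketch. The values of the integrals $H_{\rm gg}$, $H_{\kappa\kappa}$ and $H_{\kappa{\rm g}}$ are simply treated as given inputs here (in practice they follow from a cosmological model), and all function names are hypothetical.

```python
import math


def aperture_statistics(b, r, omega_m, h0_over_c, h_gg, h_kk, h_kg):
    """Forward model: the three aperture statistics of (146), (149), (152)."""
    n2 = 2.0 * math.pi * b**2 * h_gg
    map2 = 4.5 * math.pi * h0_over_c**4 * omega_m**2 * h_kk
    map_n = 3.0 * math.pi * h0_over_c**2 * omega_m * b * r * h_kg
    return n2, map2, map_n


def bias_and_correlation(n2, map2, map_n, omega_m, h0_over_c, h_gg, h_kk, h_kg):
    """Invert the measured statistics for b and r, following (151) and (154)."""
    f_b = 2.25 * h0_over_c**4 * h_kk / h_gg      # prefactor of (151)
    f_r = math.sqrt(h_kk * h_gg) / h_kg          # prefactor of (154)
    b = math.sqrt(f_b * omega_m**2 * n2 / map2)
    r = f_r * map_n / math.sqrt(map2 * n2)
    return b, r
```

Feeding the output of `aperture_statistics` into `bias_and_correlation` with the same model integrals returns the input values of $b$ and $r$, which is a useful consistency check on the prefactors.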

Whereas the two aperture measures $M_{\rm ap}$ and $\mathcal{N}$ can in principle be obtained
from the data field by putting down circular apertures, and the corresponding
second-order statistics can likewise be determined through unbiased
estimators defined on these apertures, this is not the method of choice in practice,
due to gaps and holes in the data field. Note that in our discussion of
cosmic shear in Sect. 6.3, we have expressed $\langle M_{\rm ap}^2(\theta) \rangle$ in terms of the shear
two-point correlation functions $\xi_\pm(\theta)$ - see (115) - just for this reason. In
close analogy, $\langle \mathcal{N}^2(\theta) \rangle$ can be expressed in terms of the angular correlation
function $\omega(\theta)$ of the projected galaxy positions, as seen by (144), or more
explicitly, when replacing the power spectrum $P_\omega(\ell)$ in (146) by its Fourier
transform, which is the angular correlation function, one finds

$$\langle \mathcal{N}^2(\theta) \rangle = \int_0^{2\theta} \frac{{\rm d}\vartheta\; \vartheta}{\theta^2}\, \omega(\vartheta)\, T_+\!\left(\frac{\vartheta}{\theta}\right) , \tag{155}$$

where the function $T_+$ is the same as that occurring in (115). Correspondingly,
we introduce the power spectrum $P_{\kappa{\rm g}}(\ell)$, which is defined as

$$\langle \hat\kappa(\boldsymbol{\ell})\, \hat\kappa_{\rm g}^*(\boldsymbol{\ell}') \rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_{\kappa{\rm g}}(|\boldsymbol{\ell}|) . \tag{156}$$

Applying (98), as well as the definitions of the bias and correlation functions,
this projected cross-power spectrum is related to the 3-D density contrast by

$$P_{\kappa{\rm g}}(\ell) = \frac{3}{2} \left(\frac{H_0}{c}\right)^2 \Omega_{\rm m}\, b\, r \int {\rm d}w\; \frac{g(w)\, p_{\rm f}(w)}{a(w)\, f_K(w)}\, P_\delta\!\left(\frac{\ell}{f_K(w)}, w\right) . \tag{157}$$

The angular correlation function $\langle \kappa(\boldsymbol{\vartheta})\, \kappa_{\rm g}(\boldsymbol{\vartheta}') \rangle$ occurring in (152) can then be
replaced by its Fourier transform $P_{\kappa{\rm g}}$. On the other hand, since the Fourier
transform of the surface mass density $\kappa$ is simply related to that of the shear,
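one can consider the correlation between the galaxy positions with the tangential
shear component,

$$\langle \gamma_{\rm t}(\theta) \rangle := \langle \kappa_{\rm g}(\boldsymbol{0})\, \gamma_{\rm t}(\boldsymbol{\theta}) \rangle = -\int \frac{{\rm d}^2\ell}{(2\pi)^2} \int \frac{{\rm d}^2\ell'}{(2\pi)^2}\; {\rm e}^{2{\rm i}(\beta' - \varphi)}\, \exp(-{\rm i}\boldsymbol{\ell}' \cdot \boldsymbol{\theta})\, \langle \hat\kappa_{\rm g}^*(\boldsymbol{\ell})\, \hat\kappa(\boldsymbol{\ell}') \rangle = \frac{1}{2\pi} \int {\rm d}\ell\; \ell\, J_2(\theta\ell)\, P_{\kappa{\rm g}}(\ell) . \tag{158}$$

Note that $\langle \gamma_{\rm t}(\theta) \rangle$ is just the galaxy-galaxy lensing signal discussed in Sect. 8.2;
this shows very clearly that galaxy-galaxy lensing measures the correlation
of mass and light in the Universe. In terms of this mean tangential shear, the
cross-correlation of aperture mass and aperture counts can be written as

$$\langle M_{\rm ap}(\theta)\, \mathcal{N}(\theta) \rangle = \int_0^{2\theta} \frac{{\rm d}\vartheta\; \vartheta}{\theta^2}\, \langle \gamma_{\rm t}(\vartheta) \rangle\, T_2\!\left(\frac{\vartheta}{\theta}\right) , \tag{159}$$

where the function $T_2$ is defined in a way similar to $T_\pm$ and given explicitly
as

$$T_2(x) = 576 \int_0^\infty \frac{{\rm d}t}{t^3}\, J_2(xt)\, [J_4(t)]^2 ; \tag{160}$$

this function vanishes for $x > 2$, so that the integral in (159) extends over
a finite interval only. Hence, all three aperture correlators can be calculated
from two-point correlation functions which can be determined from the data
directly, independent of possible gaps in the field geometry.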



Results from the RCS. Hoekstra et al. (2002c) have applied the foregoing
equations to a combination of their RCS survey and the VIRMOS-DESCART
survey. The former was used to determine $\langle \mathcal{N}^2 \rangle$ and $\langle M_{\rm ap} \mathcal{N} \rangle$, the latter for
deriving $\langle M_{\rm ap}^2 \rangle$. As pointed out by these authors, this combination of surveys
is very useful, in that the power spectrum at a redshift around $z \sim 0.35$ can be
probed; indeed, they demonstrate that the effective redshift distributions over
which the power spectrum, and thus $b$ and $r$, are probed are well matched
for all three statistics for their choice of surveys. 'Foreground' galaxies for
the measurement of $\omega(\theta)$ and $\langle \gamma_{\rm t}(\theta) \rangle$ are chosen to have $19.5 < R_C < 21$,
'background' galaxies are those with $21.5 < R_C < 24$. In the left part of Fig. 53 the three
aperture statistics are shown as a function of angular scale, as determined
from their combined survey, whereas in the right panels, the ratios of these
statistics as they appear in (151) and (154) are displayed. Also shown are
predictions of these quantities from two cosmological models, assuming $b = 1$
and $r = 1$. The fact that these model predictions are fairly constant in the
right-hand panels shows that the factors $f_{\rm b}$ and $f_{\rm r}$ are nearly independent of
the radius $\theta$ of the aperture, as mentioned before.

Fig. 53. The left figure displays the three aperture statistics as measured by combining
the RCS and the VIRMOS-DESCART survey. Points show measured values,
as determined from the correlation functions. The right panels display the ratios of
the aperture statistics as they appear in (151) and (154). The dotted and dashed
curves in all panels show the predictions for an OCDM and a $\Lambda$CDM model, respectively,
both with $\Omega_{\rm m} = 0.3$, $\sigma_8 = 0.9$, and $\Gamma_{\rm spect} = 0.21$, for the fiducial values
of $b = 1 = r$. The fact that the curves in the right panels are nearly constant shows
the near-independence of $f_{\rm b}$ and $f_{\rm r}$ on the filter scale. The upper axis in the right
panels shows the effective physical scale on which the values of $b$ and $r$ are measured
(from Hoekstra et al. 2002c)

Fig. 54. The values of the bias and correlation coefficient, as determined from (151)
and (154) and the results shown in Fig. 53; here, a $\Lambda$CDM model has been assumed
for the cosmology dependence of the functions $f_{\rm b}$ and $f_{\rm r}$. The upper axis indicates
the effective scale on which $b$ and $r$ are measured (from Hoekstra et al. 2002c)

The results for the bias and correlation factor are shown in Fig. 54, as
a function of angular scale and effective physical scale, corresponding to a
median redshift of $z \simeq 0.35$. The results indicate that the bias factor and the
galaxy-mass correlation coefficient are compatible with a constant value on
large scales, $\gtrsim 5\,h^{-1}\,$Mpc, but on smaller scales both seem to change with
scale. The transition between these two regimes occurs at about the scale
where the density field at redshift $z \sim 0.35$ turns from linear to non-linear
evolution. In fact, in the non-linear regime one does not expect a constant
value of both coefficients, whereas in the linear regime, constant values for
them appear natural. It is evident from the figure that the error bars are
still too large to draw definite conclusions about the behavior of $b$ and $r$ as a
function of scale, but the approach to investigate the relation between galaxies
and mass is extremely promising and will certainly yield very useful insight
when applied to the next generation of cosmic shear surveys. In particular,
with larger surveys than currently available, different cuts in the definition
of foreground and background galaxies can be used, and thus the redshift
dependence of $b$ and $r$ can be investigated. This is of course optimized if
(photometric) redshift estimates for the galaxy sample become available.

Results from the SDSS. The large sample of galaxies with spectroscopic
redshifts already available now from the SDSS permits an accurate study
of the biasing properties of these galaxies (see the end of Sect. 8.2). Two
different approaches should be mentioned here: the first follows along the line
discussed above and has been published in Sheldon et al. (2004). In short, the
galaxy-galaxy signal can be translated into the galaxy-mass cross-correlation
function $\xi_{\rm gm}$, due to the knowledge of galaxy redshifts. The ratio of $\xi_{\rm gm}$ and
the galaxy two-point correlation function $\xi_{\rm gg}$ then depends on the ratio $r/b$.
In Fig. 55 we show the galaxy-mass correlation as a function of linear scale, as
well as the ratio $b/r$. Note that from the SDSS no cosmic shear measurement
has been obtained yet, owing to the complex PSF properties, and therefore
$b$ and $r$ cannot be measured separately from this data set.

The galaxy-mass correlation function follows a power law over more than 
two orders-of-magnitude in physical scale, and its slope is very similar to the 
slope of the galaxy two-point correlation function. Hence, the ratio between 
these two is nearly scale-independent. When splitting the sample into blue
and red, and early- and late-type galaxies, the correlation length is larger for 
the red and the early-type ones. Furthermore, as expected, the lensing signal 
increases with the velocity dispersion in early-type galaxies. 

An alternative approach was taken by Seljak et al. (2004). Their starting
point is the fact that the biasing properties of dark matter halos are very well
determined from cosmological simulations. This is of course not true for the
biasing of galaxies. The bias parameter of galaxies with luminosity $L$ is given
as

$$b(L) = \int {\rm d}M\; p(M|L)\, b_{\rm h}(M) , \tag{161}$$

Fig. 55. The galaxy-mass cross-correlation function $\xi_{\rm gm}(r)$, as a function of linear
scale (dots with error bars), scaled to a matter density parameter of $\Omega_{\rm m} = 0.27$, as well
as the two-point galaxy correlation function obtained from the same set of (foreground)
galaxies (solid curve). The ratio between these two is given in the lower panel, which
plots $b/r$ as a function of scale. Over the full range of scales, $\xi_{\rm gm}$ can be well
approximated by a power law, $\xi_{\rm gm} = (r/r_0)^{-\gamma}$, with slope $\gamma = 1.79 \pm 0.06$
and correlation length $r_0 = (5.4 \pm 0.7)(\Omega_{\rm m}/0.27)^{-1/\gamma}\,h^{-1}\,$Mpc. The ratio
$b/r \sim (1.3 \pm 0.2)(\Omega_{\rm m}/0.27)$ is consistent with being scale-independent

where $b_{\rm h}$ is the bias of halos of mass $M$ relative to the large-scale matter
distribution, and $p(M|L)$ is the probability that a galaxy with luminosity
$L$ resides in a halo of mass $M$. This latter probability distribution is then
parameterized for any luminosity bin, by assuming that a fraction $1 - \alpha$ of
all galaxies in the luminosity bin considered are at the center of their parent
halos, whereas the remaining fraction $\alpha$ are satellite galaxies. For the central
galaxies, a unique mass $M(L)$ is assigned, whereas for the non-central ones,
a mass distribution is assumed. The values of $\alpha$ and $M$ for six luminosity
bins are shown in the various panels of Fig. 51; they are obtained by fitting
the galaxy-galaxy lensing signal with the model just described. The main
reason why the mass spectrum can be probed is that the numerous low-mass
galaxy halos contribute to the lensing signal only at relatively small scales,
whereas at larger scales the higher-mass halos dominate the signal; hence,
different halo masses appear at different separations in the galaxy-galaxy
lensing signal. In this way, $b(L)$ can be determined, which depends on the
non-linear mass scale $M_*$ (see Sect. 6.2 of IN). The bias parameter is a relatively
slowly varying function of galaxy luminosity for $L \lesssim L_*$, approaching a value
$\sim 0.7$ for very low-luminosity galaxies, but quickly rises for $L > L_*$.

Seljak et al. combined these measurements of the bias parameter with
the clustering properties of the SDSS galaxies and the WMAP results on the
CMB anisotropy, and derived new constraints on the normalization, $\sigma_8 = 0.88 \pm 0.06$,
and on the bias parameter of an $L_*$-galaxy, $b_* = 0.99 \pm 0.07$; furthermore, the combination of
these datasets is used to obtain new constraints on the standard cosmological
parameters. This work has opened up a new way to employ the results
from galaxy-galaxy lensing as a cosmological tool.

8.4 Galaxy biasing: magnification method 

High-redshift QSOs are observed to be correlated on the sky with lower- 
redshift galaxies and clusters. This topic has indeed an interesting history: 
The detection of very close associations of high-z QSOs with low-z galaxies
(see Arp 1987, and references therein) has been claimed as evidence against 
the cosmological interpretation of the QSO redshifts, as the probabilities of 
observing such close pairs of objects which are physically unrelated were 
claimed to be vanishingly small. However, these probabilities were obtained 
a posteriori, and of course, any specific configuration has a vanishingly small 
probability. Since the cosmological interpretation of QSO redshifts is sup- 
ported by overwhelming evidence, the vast majority of researchers consider 
these associations as a statistical fluke. 

A physical possibility to generate the association of background sources
with foreground objects is provided by the magnification bias caused by lensing:
the number counts of background sources are changed in regions where a
foreground lens yields magnifications different from unity - see Sect. 5 of IN.
Thus, close to a galaxy where $\mu > 1$, the number counts of bright background
QSOs can be enhanced since the slope of their counts is steeper than unity.
There have been various attempts in the literature to 'explain' the observed
QSO-galaxy associations by invoking the magnification bias, either with a
smooth galaxy mass distribution or by including the effects of microlensing;
see SEF for a detailed discussion of this effect. The bottom line, however,
is that the magnification effect is far too small to account for the
small (a posteriori) probabilities of the observed individual close associations.

The topic has been revived, though in a different direction, by the finding
that high-redshift AGNs are statistically associated with low-redshift galaxies.
Fugmann (1990) provided evidence that radio-selected high-z AGNs from
the 1-Jansky-catalog are correlated with relatively bright (and therefore low-z)
galaxies taken from the Lick catalog, an analysis that later on was repeated
by Bartelmann & Schneider (1993), using a slightly different statistic. Different
samples of foreground and background populations have been employed in
further studies, including the correlation between 1-Jansky AGN with bright
IRAS galaxies (Bartelmann & Schneider 1994; Bartsch et al. 1997), high-z
QSOs with clusters from the Zwicky catalog of clusters (Rodrigues-Williams
& Hogan 1994; Seitz & Schneider 1995b), 1-Jansky AGNs with red galaxies
from the APM catalog (Benítez & Martínez-González 1995; see also Norman
& Impey 2001), to mention just a few. Radio-selected AGN are considered
to be a more reliable probe since their radio flux is unaffected by extinction,
an effect which could cause a bias (if the sky shows patchy extinction, both
galaxies and QSOs would have correlated inhomogeneous distributions on
the sky) or anti-bias (if extinction is related to the lensing matter) for
flux-limited optical surveys of AGNs, and which therefore needs to be taken into
account in the correlation analysis of optically-selected AGNs. However, most
radio source catalogs are not fully optically identified and lack redshifts, and
using incomplete radio surveys therefore can induce a selection bias (Benítez
et al. 2001). These latter authors investigated the correlation between two
completely identified radio catalogs with the COSMOS galaxy catalog, and
found a very significant correlation signal.

The upshot of all these analyses is that there seems to be a positive
correlation between the high-z sources and the low-z objects, on angular
scales between $\sim 1'$ and about $1^\circ$. The significances of these correlations are
often not very large; they typically are at the 2-3$\sigma$ level, essentially limited
by the finite number of high-redshift radio sources with a large flux. The
latter is needed for two reasons: first, only radio surveys with a high
flux threshold, such as the 1-Jansky catalog, have been completely optically
identified and redshifts determined, which is necessary to exclude low-redshift
sources which could be physically associated with the 'foreground' galaxy
population, and second, because the counts are steep only for high fluxes,
needed to obtain a high magnification bias. If this effect is real, it cannot
be explained by lensing caused by individual galaxies; the angular region on
which galaxies produce an appreciable magnification is just a few arcseconds.
However, if galaxies trace the underlying (dark) matter distribution, the latter
can yield magnifications (in the same way as it yields a shear) on larger scales.
An obvious qualitative interpretation of the observed correlation is
therefore that it is due to magnification by the large-scale matter distribution
in the Universe of which the galaxies are tracers. This view is supported by
the finding (Ménard & Péroux 2003) that there is a significant correlation
of bright QSOs with metal absorption systems in the sense that there are
relatively more bright QSOs with an absorber than without; this effect shows
the expected trend from magnification bias caused by matter distributions
associated with the absorbing material.

We therefore consider a flux-limited sample of AGNs, with distance probability
distribution $p_{\rm Q}(w)$, and a sample of galaxies with distance distribution
$p_{\rm f}(w)$. It will be assumed that the AGN sample has been selected such that
it includes only objects with redshift larger than some threshold $z_{\rm min}$,
corresponding to a minimum comoving distance $w_{\rm min}$, which is larger than the
distances of all galaxies in the sample. We define the AGN-galaxy correlation
function as

$$w_{\rm Qg}(\theta) = \frac{\left\langle \left[ N_{\rm g}(\boldsymbol{\phi}) - \bar N_{\rm g} \right] \left[ N_{\rm Q}(\boldsymbol{\phi} + \boldsymbol{\theta}) - \bar N_{\rm Q} \right] \right\rangle}{\bar N_{\rm g}\, \bar N_{\rm Q}} , \tag{162}$$

where $N_{\rm g}(\boldsymbol{\phi})$ and $N_{\rm Q}(\boldsymbol{\phi})$ are the observed number densities of galaxies and
AGNs, respectively. The former is given by (142). The observed number density
of AGN is affected by the magnification bias. Provided the unlensed
counts can be described (locally) as a power-law in flux, $N_{{\rm Q},0}(>S) \propto S^{-\beta}$,
then from (108) of IN we find that $N_{\rm Q}(\boldsymbol{\phi}) = N_{{\rm Q},0}\, \mu^{\beta-1}(\boldsymbol{\phi})$, where $\mu(\boldsymbol{\phi})$ is the
magnification in the direction $\boldsymbol{\phi}$. Then, if the magnifications that are relevant
are small, we can approximate

$$\mu(\boldsymbol{\phi}) \approx 1 + 2\kappa(\boldsymbol{\phi}) = 1 + \delta\mu(\boldsymbol{\phi}) , \tag{163}$$

and the projected surface mass density $\kappa$ is given by (93) with $p_w$ in (94)
replaced by $p_{\rm Q}$. Assuming that the magnifications do not affect the mean
source counts $\bar N_{\rm Q}$, the cross-correlation becomes

$$w_{\rm Qg}(\theta) = 2(\beta - 1)\, \bar b(\theta)\, \bar r(\theta)\, w_{\kappa{\rm g}}(\theta) , \tag{164}$$

where $\bar b$ and $\bar r$ are the effective bias factor of the galaxies and the mean
galaxy-mass correlation coefficient just as in Sect. 8.3, and $w_{\kappa{\rm g}}$ is the correlation
between the projected density field $\kappa$ and the projected number density of
galaxies $\kappa_{\rm g}$, defined after (142), which is the Fourier transform of $P_{\kappa{\rm g}}(\ell)$
defined in (141). Hence, a measurement of this correlation, together with a
measurement of the correlation function of galaxies, can constrain the values
of $b$ and $r$ (Dolag & Bartelmann 1997; Ménard & Bartelmann 2002).

The observed correlation between galaxies and background AGN appears
to be significantly larger than can be accounted for by the models presented
above. On scales of a few arcmin, Benítez et al. (2001) argued that the observed
signal exceeds the theoretical expectations by a factor of a few. This
discrepancy can be attributed to either observational effects, or shortcomings
of the theoretical modelling. Obviously, selection effects can easily produce
spurious correlations, such as patchy dust obscuration or a physical association
of AGNs with the galaxies. Furthermore, the weak lensing approximation
employed above can break down on small angular scales. Jain et al. (2003,
see also Takada & Hamana 2003) argued that the simple biasing model most
likely breaks down for the small scales where the discrepancy is seen, and
employed the halo model for describing the large-scale distribution of matter
and galaxies to predict the expected correlations. For example, the strength
of the signal depends sensitively on the redshifts, magnitudes and galaxy
type.

At present, the shear method to determine the bias factor and the galaxy-mass
correlation has yielded more significant results than the magnification
method, owing to the small complete and homogeneous samples of high-redshift
AGNs. As pointed out by Ménard & Bartelmann (2002), the SDSS
may well change this situation shortly, as this survey will obtain $\sim 10^5$
homogeneously selected spectroscopically verified AGNs. Provided the effects
of extinction can be controlled sufficiently well, these data should provide a
precision measurement of the QSO-galaxy correlation function.

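
The magnification-bias relations of this section lend themselves to a small numerical sketch. It assumes the power-law counts $N_{{\rm Q},0}(>S) \propto S^{-\beta}$ stated in the text, compares the full expression $\mu^{\beta-1}$ with its weak-lensing linearization based on (163), and evaluates the amplitude relation (164) as written, with $w_{\kappa{\rm g}}$ taken as a given input. All function names are hypothetical.

```python
def lensed_counts(n_q0, mu, beta):
    """Counts above the flux limit behind magnification mu: N = N_0 * mu**(beta - 1)."""
    return n_q0 * mu ** (beta - 1.0)


def lensed_counts_weak(n_q0, kappa, beta):
    """First-order expansion using mu ~ 1 + 2*kappa, valid only for |kappa| << 1."""
    return n_q0 * (1.0 + 2.0 * (beta - 1.0) * kappa)


def agn_galaxy_correlation(w_kappa_g, beta, b, r):
    """AGN-galaxy cross-correlation amplitude according to relation (164)."""
    return 2.0 * (beta - 1.0) * b * r * w_kappa_g
```

For $\beta > 1$ the counts are enhanced where $\mu > 1$, for $\beta < 1$ they are depleted, and for $\beta = 1$ the effect vanishes, which is why steep counts (bright sources) are needed to obtain an appreciable signal.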

9 Additional issues in cosmic shear

9.1 Higher-order statistics

On the level of second-order statistics, 'only' the power spectrum is probed.
If the density field was Gaussian, then the power spectrum would fully characterize
it; however, in the course of non-linear structure evolution, non-Gaussian
features of the density field are generated, which show up correspondingly
in the cosmic shear field and which can be probed by higher-order
shear statistics. The usefulness of these higher-order measures for cosmic
shear has been pointed out in Bernardeau et al. (1997), Jain & Seljak (1997),
Schneider et al. (1998a) and van Waerbeke et al. (1999); in particular, the
near-degeneracy between $\sigma_8$ and $\Omega_{\rm m}$ as found from using second-order statistics
can be broken. However, there are serious problems with higher-order
shear statistics, that shall be illustrated below in terms of the third-order
statistics.

But first, we can give a simple argument why third-order statistics is able
to break the degeneracy between $\Omega_{\rm m}$ and $\sigma_8$. Consider a density field on a
scale where the inhomogeneities are just weakly non-linear. One can then employ
second-order perturbation theory for the growth of the density contrast
$\delta$. Hence, we write $\delta = \delta^{(1)} + \delta^{(2)} + \dots$, where $\delta^{(1)}$ is the density contrast
obtained from linear perturbation theory, and $\delta^{(2)}$ is the next-order term. This
second-order term is quadratic in the linear density field, $\delta^{(2)} \propto (\delta^{(1)})^2$. The
linear density field is proportional to $\sigma_8$, and the projected density $\kappa \propto \Omega_{\rm m} \sigma_8$.
Hence, in the linear regime, $\langle \kappa^2 \rangle \propto \Omega_{\rm m}^2 \sigma_8^2$, where $\langle \kappa^2 \rangle$ shall denote here any
second-order shear estimator. The lowest order contribution to the third-order
statistics is of the form

$$\langle \kappa^3 \rangle \propto \left\langle (\delta^{(1)})^2\, \delta^{(2)} \right\rangle \propto \Omega_{\rm m}^3\, \sigma_8^4 ,$$

since the term $(\delta^{(1)})^3$ yields no contribution owing to the assumed Gaussianity
of the linear density field. Hence, a skewness statistics of the form

$$\langle \kappa^3 \rangle / \langle \kappa^2 \rangle^2 \propto \Omega_{\rm m}^{-1}$$

will be independent of the normalization $\sigma_8$, at least in this simplified perturbation
approach. In more accurate estimates, this is not exactly true; nevertheless,
the functional dependencies of the second- and third-order shear
statistics on $\sigma_8$ and $\Omega_{\rm m}$ are different, so that these parameters can be determined
separately.
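
The scaling argument just given can be illustrated with a toy calculation. The sketch below keeps only the parameter dependence of the lowest-order perturbative result and sets every constant of proportionality to unity, so that the numbers carry no physical normalization; the function names are hypothetical.

```python
def second_order(omega_m, sigma8):
    """Toy scaling <kappa^2> ~ Omega_m^2 * sigma_8^2 (prefactor set to 1)."""
    return omega_m**2 * sigma8**2


def third_order(omega_m, sigma8):
    """Toy scaling <kappa^3> ~ Omega_m^3 * sigma_8^4 (prefactor set to 1)."""
    return omega_m**3 * sigma8**4


def skewness(omega_m, sigma8):
    """Ratio <kappa^3> / <kappa^2>^2, which scales as 1/Omega_m in this toy model."""
    return third_order(omega_m, sigma8) / second_order(omega_m, sigma8) ** 2
```

Two parameter pairs with the same product $\Omega_{\rm m}\sigma_8$, for instance $(0.3, 0.9)$ and $(0.27, 1.0)$, give the same second-order amplitude in this toy model but different values of the skewness, which is the sense in which the third-order statistics breaks the degeneracy.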

The shear three-point correlation function. Most of the early studies
on three-point statistics concentrated on the third-order moment of the
surface mass density $\kappa$ in a circular aperture, $\langle \kappa^3(\theta) \rangle$; however, this is not a
directly measurable quantity, and therefore useful only for theoretical considerations.
As for second-order statistics, one should consider the correlation
functions, which are the quantities that can be obtained best directly from
the data and which are independent of holes and gaps in the data field.
The three-point correlation function (3PCF) of the shear has three independent
variables (e.g. the sides of a triangle) and 8 components; as was shown
in Schneider & Lombardi (2003), none of these eight components vanishes
owing to parity invariance (as was suspected before - this confusion arises
because little intuition is available on the properties of the 3PCF of a polar).
This then implies that the covariance matrix has 6 arguments and 64
components! Of course, this is too hard to handle efficiently, therefore one
must ask which combinations of the components of the 3PCF are most useful
for studying the dark matter distribution. Unfortunately, this is essentially
unknown yet. An additional problem is that the predictions from theory are
less well established than for the second-order statistics.

A further complication stems from a certain degree of arbitrariness on how 
to define the 8 components of the 3PCF. For the 2PCF, the vector between 
any pair of points defines a natural direction with respect to which tangential 
and cross components of the shear are defined; this is no longer true for three 
points. On the other hand, the three points of a triangle define a set of centers, 
such as the 'center of mass', or the center of the in- or circum-circle. After 
choosing one of these centers, one can define the two components of the shear 
which are then independent of the coordinate frame. 

Nevertheless, progress has been achieved. From ray-tracing simulations
through a cosmic matter distribution, the 3PCF of the shear can be determined
(Takada & Jain 2003a; see also Zaldarriaga & Scoccimarro 2003;
furthermore, the three-point cosmic shear statistics can also be determined in
the frame of the halo model, see Cooray & Hu 2001; Takada & Jain 2003b),
whereas Schneider & Lombardi (2003) have defined the 'natural components'
of the shear 3PCF which are most easily related to the bispectrum of the
underlying matter distribution. Let $\gamma^{\rm c}(\boldsymbol{\theta}_i) = \gamma_{\rm t} + {\rm i}\gamma_\times = -\gamma\, {\rm e}^{-2{\rm i}\zeta_i}$ be the
complex shear measured in the frame which is rotated by the angle $\zeta_i$ relative
to the Cartesian frame, so that the real and imaginary parts of $\gamma^{\rm c}$ are the
tangential and cross components of the shear relative to the chosen center of
the triangle (which has to be defined for each triplet of points separately).
Then the natural components are defined as

$$\Gamma^{(0)} = \left\langle \gamma^{\rm c}(\boldsymbol{\theta}_1)\, \gamma^{\rm c}(\boldsymbol{\theta}_2)\, \gamma^{\rm c}(\boldsymbol{\theta}_3) \right\rangle , \qquad \Gamma^{(1)} = \left\langle \gamma^{{\rm c}*}(\boldsymbol{\theta}_1)\, \gamma^{\rm c}(\boldsymbol{\theta}_2)\, \gamma^{\rm c}(\boldsymbol{\theta}_3) \right\rangle , \tag{165}$$

and correspondingly for $\Gamma^{(2)}$ and $\Gamma^{(3)}$. Each of the natural components of the
3PCF constitutes a complex number, which depends just on the three separations
between the points. Special care is required for labelling the points, and
one should follow the rule that they are labeled in a counter-clockwise direction
around the triangle. If such a unique prescription is not systematically applied,
confusing and wrong conclusions will be obtained about the behaviour
of the shear 3PCF with respect to parity transformations (as the author has
experienced painfully enough). In Schneider et al. (2004), explicit relations
are derived for the natural components of the shear 3PCF in terms of the bispectrum
(that is, the generalization of the power spectrum for the three-point
statistics) of the underlying mass distribution $\kappa$.
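
The projection that enters the natural components can be sketched for a single triangle. The choices made below are assumptions for illustration only: positions are represented as complex numbers $x + {\rm i}y$, the centroid ('center of mass') is adopted as the center of the triangle, and $\zeta_i$ is taken to be the polar angle of the direction from the center to the point $i$. An actual estimate of $\Gamma^{(0)}$ and $\Gamma^{(1)}$ requires averaging such products over all triangles of a given shape; the function names are hypothetical.

```python
import cmath


def project_shear(gamma, position, centre):
    """Return gamma_t + i*gamma_x = -gamma * exp(-2i*zeta) relative to `centre`."""
    zeta = cmath.phase(position - centre)  # polar angle of the point seen from the centre
    return -gamma * cmath.exp(-2j * zeta)


def natural_component_products(gammas, positions):
    """Contribution of one triangle to Gamma^(0) and Gamma^(1).

    gammas:    three complex Cartesian shears gamma_1 + i*gamma_2,
    positions: the three points as complex numbers, labeled counter-clockwise.
    """
    centre = sum(positions) / 3.0  # centroid of the triangle
    g1, g2, g3 = (project_shear(g, p, centre) for g, p in zip(gammas, positions))
    return g1 * g2 * g3, g1.conjugate() * g2 * g3
```

For a purely tangential shear pattern around the centroid the three projected shears are real and positive, so both products reduce to the cube of the tangential shear.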

Third-order aperture statistics. Alternatively, aperture measures can be
defined to measure the third-order statistics. Schneider et al. (1998a) calculated
$\langle M_{\rm ap}^3 \rangle(\theta)$ in the frame of the quasi-linear structure evolution model
and showed it to be a strong function of $\Omega_{\rm m}$. Van Waerbeke et al. (2001)
calculated the third-order aperture mass, using a fitting formula of the non-linear
evolution of the dark matter bispectrum obtained by Scoccimarro &
Couchman (2001) and pointed out the strong sensitivity with respect to cosmological
parameters. Indeed, as mentioned before, $\langle M_{\rm ap}^3 \rangle$ is sensitive only
to the E-modes of the shear field. One might be tempted to use $\langle M_\perp^3 \rangle(\theta)$ as
a measure for third-order B-mode statistics, but indeed, this quantity vanishes
owing to parity invariance (Schneider 2003). However, $\langle M_\perp^2 M_{\rm ap} \rangle$ is
a measure for the B-modes at the third-order statistical level. Jarvis et al.
(2004) have calculated $\langle M_{\rm ap}^3(\theta) \rangle$ in terms of the shear 3PCF, for the weight
function (110) in the definition of $M_{\rm ap}$. Schneider et al. (2004) have shown
that this relation is most easily expressed in terms of the natural components
of the shear 3PCF. On the other hand, Jarvis et al. (2004) have expressed
$\langle M_{\rm ap}^3(\theta) \rangle$ in terms of the bispectrum of $\kappa$, and as was the case for the aperture
dispersion in relation to the power spectrum of $\kappa$, the third-order aperture
mass is a very localized measure of the bispectrum and is sensitive essentially
only to modes with three wavevectors with equal magnitudes. For that
reason, Schneider et al. (2004) have generalized the definition of the third-order
aperture measures, correlating the aperture mass of three different sizes,
$\langle M_{\rm ap}(\theta_1)\, M_{\rm ap}(\theta_2)\, M_{\rm ap}(\theta_3) \rangle$. This third-order statistics is again a very localized
measure of the bispectrum, but this time with wave vectors of different
magnitude $\ell_i \approx \pi/\theta_i$, and therefore, by considering the third-order aperture
mass for all combinations of $\theta_i$, one can probe the full bispectrum. Therefore,
the third-order aperture mass correlator with three independent arguments
(i.e., angular scales) should contain essentially the full third-order statistical
information of the $\kappa$-field, since in contrast to the two-point statistics, the
shear 3PCF does not contain information about long-wavelength modes.

Furthermore, the third-order aperture statistics can be expressed directly
in terms of the shear 3PCF through a simple integration, very similar to
the relations (125) for the two-point statistics. Finally, the other three
third-order aperture statistics (e.g., $\langle M_\perp(\theta_1) M_{\rm ap}(\theta_2) M_{\rm ap}(\theta_3)\rangle$) can as well be
obtained from the natural components of the shear 3PCF. These correlators
are expected to vanish if the shear is solely due to lensing, but intrinsic
alignments of galaxies can lead to finite correlators which include B-modes.
However, as shown in Schneider (2003), $\langle M_{\rm ap}(\theta_1) M_{\rm ap}(\theta_2) M_\perp(\theta_3)\rangle$, as well as
$\langle M_\perp(\theta_1) M_\perp(\theta_2) M_\perp(\theta_3)\rangle$, are expected to vanish even in the presence of
B-modes, since these two correlators are not invariant with respect to a parity
transformation. Therefore, non-zero results of these two correlators signify
the violation of parity invariance and therefore provide a clean check on the
systematics of the data and their analysis.
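
The parity argument can be written out in a few lines. The sketch below assumes the standard definitions of the aperture measures in terms of the tangential and cross components of the shear, $\gamma_{\rm t}$ and $\gamma_\times$, with a filter function written here as $Q_\theta$, and a shear field whose statistics are invariant under a reflection $\mathcal{P}$.

```latex
\begin{align*}
M_{\rm ap}(\theta) &= \int {\rm d}^2\vartheta\; Q_\theta(|\vartheta|)\,\gamma_{\rm t}(\vartheta), \qquad
M_\perp(\theta) = \int {\rm d}^2\vartheta\; Q_\theta(|\vartheta|)\,\gamma_\times(\vartheta), \\
\mathcal{P}:\quad (\gamma_{\rm t},\gamma_\times) &\to (\gamma_{\rm t},-\gamma_\times)
  \quad\Longrightarrow\quad (M_{\rm ap},M_\perp) \to (M_{\rm ap},-M_\perp), \\
\langle M_{\rm ap}(\theta_1) M_{\rm ap}(\theta_2) M_\perp(\theta_3)\rangle
  &= -\langle M_{\rm ap}(\theta_1) M_{\rm ap}(\theta_2) M_\perp(\theta_3)\rangle = 0, \\
\langle M_\perp(\theta_1) M_\perp(\theta_2) M_\perp(\theta_3)\rangle
  &= -\langle M_\perp(\theta_1) M_\perp(\theta_2) M_\perp(\theta_3)\rangle = 0, \\
\langle M_\perp(\theta_1) M_\perp(\theta_2) M_{\rm ap}(\theta_3)\rangle
  &= +\langle M_\perp(\theta_1) M_\perp(\theta_2) M_{\rm ap}(\theta_3)\rangle .
\end{align*}
```

Correlators containing an odd number of $M_\perp$ factors change sign under the reflection and must vanish, whereas those with an even number are unconstrained by parity and can therefore carry a B-mode signal.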

First detections. Bernardeau et al. (2002) measured for the first time a 
significant third-order shear from the VIRMOS-DESCART survey, employing 
a suitably filtered integral over the measured 3PCF (as defined in Bernardeau 
et al. 2003). Pen et al. (2003) used the aperture statistics to detect a skewness 
in the same data set. The accuracy of these measurements is not sufficient to 
derive strong constraints on cosmological parameters, owing to the limited 
sky area available. However, with the upcoming large cosmic shear surveys, 
the 3PCF will be measured with high accuracy. Determining the 3PCF from 
observed galaxy ellipticities cannot be done by straightforwardly considering 
any triple of galaxies - there are just too many. Jarvis et al. (2004) and 
Zhang & Pen (2003) have developed algorithms for calculating the 3PCF in 
an efficient way. 
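
To see why a brute-force approach fails, one can simply count the triples. The sketch below is plain combinatorics; the galaxy number used is a hypothetical example and is not taken from any survey, and the function names are illustrative.

```python
import math


def n_pairs(n_gal):
    """Number of distinct galaxy pairs, as needed for the 2PCF."""
    return math.comb(n_gal, 2)


def n_triples(n_gal):
    """Number of distinct galaxy triples, as needed for a naive 3PCF."""
    return math.comb(n_gal, 3)


# Hypothetical catalogue size, chosen only to illustrate the scaling.
n_gal = 10**5
print(n_pairs(n_gal), n_triples(n_gal))

# The ratio triples/pairs is (n_gal - 2) / 3, so the cost of the naive
# three-point estimator exceeds that of the two-point one by a factor
# that itself grows linearly with the number of galaxies.
print(n_triples(n_gal) // n_pairs(n_gal))
```

The steep growth of the triple count with catalogue size is what makes an efficient algorithm necessary rather than merely convenient.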

Based on the halo model for the description of the LSS, Takada & Jain 
(2003b) studied the dependence of the shear 3PCF on cosmological param- 
eters. For relatively large triangles, the 3PCF provides a means to break 
the degeneracies of cosmological parameters that are left when using the 
second-order statistics only, as argued above. For small triangles, the 3PCF 
is dominated by the one-halo term, and therefore primarily probes the mass 
profiles of halos. Ho & White (2004) show that the 3PCF on small angular
scales also contains information on the asphericity of dark matter halos. The
full power of third-order statistics is achieved once redshift information on
the source galaxies becomes available, in which case the combination of the
2PCF and 3PCF provides a sensitive probe of the equation of state of the
dark energy (Takada & Jain 2004).

Beyond third order. One might be tempted to look into the properties
of the fourth-order shear statistics (though I'm sure the reader can control
herself in doing this - but see Takada & Jain 2002). OK, the four-point
correlation function has 16 components and depends on 5 variables, not to
mention the corresponding covariance or the redshift dependent fourth-order
correlator. One can consider correlating the aperture mass of four different
angular sizes, but in contrast to the third-order statistics, this is expected
not to contain the full information on the trispectrum (which describes the
fourth-order statistical properties of $\kappa$). Perhaps a combination of this
fourth-order aperture mass with the average of the fourth power of the mean shear
in circular apertures will carry most of the information. And how much
information on cosmological parameters does the fourth-order shear statistics
contain? And even higher orders?

Already the third-order shear statistic is not accurately predictable from
analytic descriptions of the non-linear evolution of the matter inhomogeneities,
and the situation worsens with even higher order (see footnote 13). One therefore needs to
refer to detailed ray-tracing simulations. Although they are quite time con- 
suming, I do not see a real bottleneck in this aspect: Once a solid and accu- 
rate measurement of the three-point correlation function becomes available, 
certainly considerable effort will be taken to compare this with numerical 
simulations (in particular, since such a measurement is probably a few years 
ahead, in which the computer power will increase by significant factors). If 
we accept this point, then higher-order statistics can be obtained from these 
simulations, and several can be 'tried out' on the numerical data such that 
they best distinguish between different models. For example, one can
consider the full probability distribution $p(M_{\rm ap};\theta)$ on a given data set (Kruse &
Schneider 2000; Reblinsky et al. 1999; Bernardeau & Valageas 2000; Munshi
et al. 2004). To obtain this from the observational data, one needs to place
apertures on the data field which, as we have argued, is plagued with holes
and gaps in the data. However, we can place the same gaps on the
simulated data fields and therefore simulate this effect. Similarly, the numerical
simulations should be used to find good strategies for combining second- and
third-order shear statistics (and potentially higher-order ones) for an
optimal distinction between cosmological model parameters, and, in particular,
the equation-of-state of Dark Energy. Another issue one needs to consider for
third- (and higher-)order cosmic shear measures is that intrinsic clustering of
sources, and the correlation between galaxies and the dark matter
distribution generating the shear field, have an influence on the expected signal
strength (Bernardeau 1998; Hamana 2001; Hamana et al. 2002). Obviously,
there are still a lot of important studies to be done.

Third-order galaxy-mass correlations. We have shown in Sect. 8 how
galaxy-galaxy lensing can be used to probe the correlation between galaxies
and the underlying matter distribution. With the detection of third-order
shear statistics already in currently available data sets, one might expect that
also higher-order galaxy-mass correlations can be measured from the same
data. Such correlations would then probe, on large angular scales, the
higher-order biasing parameters of galaxies, and thereby put additional constraints
on the formation and evolution of galaxies. Menard et al. (2003) considered
the correlation between high-redshift QSOs and pairs of foreground galaxies,
thus generalizing the methods of Sect. 8.4 to third-order statistics. The
galaxy-galaxy-shear correlation, and the galaxy-shear-shear correlations have
been considered by Schneider & Watts (2004). These correlation functions
have been related to the underlying bispectrum of the dark matter and the
third-order bias and correlation functions, and appropriate aperture statistics
have been defined, that are related in a simple way to the bispectra and the
correlation functions.

13 In the limits of small and large angular scales, analytic approximations can be
obtained. For small scales, the highly non-linear regime is often described by the
hierarchical ansatz and hyperextended perturbation theory (see Munshi & Jain
2001 and references therein), whereas on very large scales second-order
perturbation theory can be used. Nevertheless, the range of validity of these perturbation
approximations and their accuracy have to be checked with numerical
simulations.

In fact, integrals of these higher-order correlations have probably been 
measured already. As shown in Fig. 50, galaxies in regions of high galaxy 
number densities show a stronger, and more extended galaxy-galaxy lensing 
signal than more isolated galaxies. Hence there is a correlation between the 
mean mass profile around galaxies and the local number density of galaxies, 
which is just an integrated galaxy-galaxy-shear correlation. In fact, such a 
correlation is only first order in the shear and should therefore be much 
easier to measure than the shear 3PCF. Furthermore, the galaxy-shear-shear 
correlation seems to be present in the cosmic shear analysis of the COMBO- 
17 fields by Brown et al. (2003), where they find a stronger-than-average 
cosmic shear signal in the A901 field, and a weaker cosmic shear signal in the 
CDFS, which is a field selected because it is rather poor in brighter galaxies. 

9.2 Influence of LSS lensing on lensing by clusters and galaxies 

The lensing effect of the three-dimensional matter distribution will contam- 
inate the lensing measurements of localized objects, such as galaxies and 
clusters. Some of the associated effects are mentioned in this section. 

Influence of cosmic shear on strong lensing by galaxies. The lensing 
effect of foreground and background matter in a strong lensing system will 
affect the image positions and flux ratios. As these 3-D lensing effects are not
recognized as such in the lens modelling, a 'wrong' lens model will be fitted 
to the data, in the sense that the mass model for the lensing galaxy will 
try to include these additional lensing effects not associated with the galaxy 
itself. In particular, the corresponding predictions for the time delays can be 
affected through this effect. 

Since the image separations of strong lens systems are less than a few
arcseconds, the lensing effect of the LSS can be well approximated by a linear
mapping across this angular scale. In this case, the effect of the 3-D matter
distribution on the lens model can be studied analytically (e.g., Bar-Kana
1996). The lens equation resulting from the main lens (the galaxy) plus the
linearized inhomogeneities of the LSS is strictly equivalent to the single-plane
gravitational lens equation without these cosmological perturbations, and the
mass distribution of the equivalent single-plane lens can be explicitly derived
(Schneider 1997). For example, if the main lens is described by elliptical
isopotential curves (i.e., elliptical contours of the deflection potential $\psi$) plus
external shear, the equivalent single-plane lens will be of the same form. The
orientation of the ellipticity of the lens, as seen by the observer, will be rotated
by the foreground LSS by the same angle as the potential of the equivalent lens,
so that no observable misalignment is induced. This equivalence then implies
that the determination of the Hubble constant from time-delay measurements
is affected by the same mass-sheet degeneracy transformation as for a single
plane lens.
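
For reference, the mass-sheet degeneracy transformation and its effect on the Hubble constant can be summarized as follows. This is the standard single-plane result written in generic notation; the symbol $\lambda$ for the transformation parameter is an assumption about notation, not a quotation from the analysis above.

```latex
\begin{align*}
\kappa(\theta) &\to \kappa_\lambda(\theta) = \lambda\,\kappa(\theta) + (1-\lambda), \\
\Delta t_{\rm model} &\to \lambda\,\Delta t_{\rm model}
  \quad\text{(image positions and flux ratios unchanged)}, \\
H_0 &\to \lambda\,H_0 \quad\text{for a fixed measured time delay.}
\end{align*}
```

Since image positions and flux ratios cannot distinguish between members of this family, any value of $H_0$ inferred from a time delay inherits the uncertainty in $\lambda$, and by the equivalence stated above this remains true when the LSS along the line-of-sight is included in linearized form.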

LSS effects on the mass determination of clusters. The determination 
of mass parameters of a cluster from weak lensing is affected by the inhomo- 
geneous foreground and background matter distribution. The effect of local 
mass associated with a cluster (e.g., filaments extending from the cluster 
along the line-of-sight) will bias the mass determination of clusters high, 
since clusters are likely to be located in overdense regions of the LSS, though 
this effect is considerably smaller than claimed by Metzler et al. (2001), as 
shown by Clowe et al. (2004a). 

Hoekstra (2001, 2003) considered the effect of the LSS on the determination
of mass parameters of clusters, using either SIS or NFW models. For the
SIS model, the one parameter characterizing this mass profile ($\sigma_v$) can be
obtained as a linear estimator of the shear. The dispersion of this parameter
is then the sum of the dispersion caused by the intrinsic ellipticity of the
source galaxies and the cosmic shear dispersion. For the NFW model, the
relation between its two parameters ($M_{200}$, the mass inside the virial radius
$r_{200}$, and the concentration $c$) and the shear is not linear, but the effect of
the LSS can still be estimated from Monte-Carlo simulations in which the
cosmic shear is assumed to follow Gaussian statistics with a power spectrum
following the Peacock & Dodds (1996) prescription.
For the SIS model, the effect of the LSS on the determination of $\sigma_v$ is
small, provided the cluster is at intermediate redshift (so that most source
galaxies are in the background). The noise caused by the finite ellipticity
in this case is almost always larger than the effect of the LSS. There is an
interesting effect, however, in that the relative contribution of the LSS and
shape noise changes as larger aperture fits to the SIS model are considered:
The larger the field over which the shear is fitted to an SIS model, the larger
becomes the impact of cosmic shear, and this increase compensates for the
reduced shape noise. In effect, cosmic shear and shape noise together put an
upper limit on the accuracy of the determination of $\sigma_v$ from shear data. The
same is true for the determination of the mass parameters of the NFW model, 
as shown in Fig. 56. The uncertainties of the mass parameters of NFW profiles 
are about twice as large as if the effects from the LSS are ignored, whereas 
the effect is considerably smaller for the one-parameter model of the SIS. One 
should also note that a decrease of the shape noise, which can be obtained by
using data with a fainter limiting magnitude, yields an increase of the noise
from the LSS, since the fainter galaxies are expected to be at higher redshift
and therefore carry a larger cosmic shear signal. For low-redshift clusters,
these two effects nearly compensate.

Fig. 56. The dispersion of the determination of the mass and concentration of
three NFW halos at redshift $z_{\rm d} = 0.3$. These parameters were derived by fitting
an NFW shear profile to the shear simulated from an NFW halo with parameters
indicated in the figure and adding shape noise and noise from cosmic shear. The
outer angular scale over which the fit was performed is $\theta_{\rm max}$. Dotted curves show
the effect from shape noise alone, dashed curves show the dispersion from cosmic
shear, and the solid curves contain both effects. Surprisingly, the accuracy of the
NFW parameters does not increase once $\theta_{\rm max} \sim 15'$ is reached, as for larger radii,
the cosmic shear noise more than compensates for the reduced ellipticity noise.
Another way to express that is that the lensing signal at very large distance from
the halo center is weaker than the rms cosmic shear and therefore does not increase
the signal-to-noise any more (from Hoekstra 2003)



The efficiency and completeness of weak lensing cluster searches. 

We take up the brief discussion at the end of Sect. 5.8 about the potential
of deriving a shear-selected sample of galaxy clusters. The first studies of
this question were based on analytical models (e.g., Kruse & Schneider 1999)
or numerical models of isolated clusters (Reblinsky & Bartelmann 1999).
Those studies can of course not account for the effects of lensing by the LSS.
Ray-tracing simulations through N-body generated LSS were carried out by
Reblinsky et al. (1999), White et al. (2002), Hamana et al. (2004), Vale &
White (2003), Hennawi & Spergel (2004) and others. In these cosmological
simulations, halos were identified based on their 3-D mass distribution. They
were then compared to the properties of the lensing results obtained from ray
tracing, either by considering the (smoothed) surface mass density $\kappa$ (that
could be obtained from a mass reconstruction from the shear field) or by
studying the aperture mass $M_{\rm ap}$ which can be obtained directly from the
shear. In both cases, noise due to the finite intrinsic source ellipticity can be
added.

The two basic quantities that have been investigated in these studies are
completeness and efficiency. Completeness is the fraction of dark matter
halos above some mass threshold $M_{\rm min}$ that are detected in the weak lensing
data, whereas efficiency is the fraction of significant lensing detections that
correspond to a real halo. Both of these quantities depend on a number of
parameters, like the mass threshold of a halo and the limiting significance
$\nu$ of a lensing detection [in the case of the aperture mass, this would
correspond to (80)], as well as on the choice of the filter function $Q$. Hennawi &
Spergel (2003) have pointed out that even without noise (from observations
or intrinsic galaxy ellipticities), the efficiency is limited to about 85%: even
under these idealized conditions, the selected sample will be contaminated by
at least 15% of spurious detections, generated by projection effects of the
LSS.
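
The two definitions translate directly into a few lines of code. The sketch below assumes that halos and lensing detections have already been cross-matched, so that each halo carries a detected/undetected flag and each detection a matched/unmatched flag; the function names and the toy catalogue are hypothetical and serve only to illustrate the bookkeeping.

```python
def completeness(halo_masses, halo_detected, m_min):
    """Fraction of halos with mass >= m_min that are detected in lensing."""
    flags = [det for mass, det in zip(halo_masses, halo_detected) if mass >= m_min]
    if not flags:
        raise ValueError("no halos above the mass threshold")
    return sum(flags) / len(flags)


def efficiency(detection_matched):
    """Fraction of significant lensing detections that match a real halo."""
    if not detection_matched:
        raise ValueError("no lensing detections")
    return sum(detection_matched) / len(detection_matched)


# Toy halo catalogue: masses in solar masses and detection flags (made up).
masses = [1e14, 2e14, 3e14, 4e14]
detected = [False, True, True, False]
print(completeness(masses, detected, m_min=2e14))

# Toy detection list: 17 of 20 peaks match a halo, i.e. the noise-free
# limit of about 85% quoted above; the remaining 3 are projection effects.
matched = [True] * 17 + [False] * 3
print(efficiency(matched))
```

Raising the significance threshold $\nu$ would generally shorten the detection list and improve the efficiency at the cost of completeness, so the two numbers are only meaningful when quoted together with the thresholds used.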

Turning to observations for comparison with these predictions, the six
highest-redshift EMSS clusters were all detected at high significance with a weak
lensing analysis (Clowe et al. 2000). Clowe et al. (2004b) have studied 20 high-redshift
clusters with weak lensing techniques. These clusters were optically selected
and are expected to be somewhat less massive (and potentially more affected
by foreground galaxies) than the EMSS clusters. Only eight of these 20
clusters are detected with more than $3\sigma$ significance, but for none of them does
the SIS fit produce a negative $\sigma_v^2$. Only for four of these clusters are the
lensing results compatible with no shear signal.

10 Concluding remarks 

Weak lensing has become a standard tool in observational cosmology, as we 
have learned how to measure the shape of faint galaxy images and to correct 
them for distortions in the telescope and camera optics and for PSF effects. 
These technical issues are at the very center of any observational weak
lensing research. It appears that at present, the accuracy with which shear can
be measured is sufficient for the data available today, in the sense that
statistical uncertainties are likely to be larger than potential inaccuracies in the
measurement of unbiased shear estimates from faint images. This, however,
will change quickly. The upcoming large cosmic shear surveys will greatly
reduce statistical uncertainties, and then the accuracy of shear measurements
from the data will be the essential limiting factor. Alternatives to KSB have
been developed, but they need to undergo thorough testing before becoming
a standard tool for observers. It should also be noted that the KSB method
is applied differently by different groups, in particular with regard to the
weighting of galaxies and other details. What is urgently needed is a study in
which different groups apply their version of KSB to the same data set and
compare the results. Furthermore, starting from raw data, the specific data
reduction methods will lead to slightly different coadded images, and shear
measurements on such differently reduced images should be compared. These
technical issues will be a central challenge for weak lensing in the upcoming
years.

The ongoing and planned wide-field imaging surveys mentioned at the 
end of Sect. 7.7 will allow us to investigate several central questions of cos- 
mology. The two aspects that I consider most relevant are the investigation 
of the equation-of-state of the Dark Energy and the relation between galaxies 
and the underlying dark matter distribution. The former question about the 
nature of Dark Energy is arguably the central challenge of modern cosmology,
and cosmic shear is one of the very few methods by which it can be studied
empirically. The relation between dark matter and galaxies is central to our
understanding of how galaxies form and evolve, and galaxy-galaxy lensing is
the only way in which this relation can be investigated without a priori
assumptions.

Essentially all weak lensing studies today have used faint galaxies as 
sources, since they form the densest source population currently observable. 
Faint optical galaxies will not remain unique in this respect, given the
currently planned future instruments. For example, there is a rich literature on
weak lensing of the cosmic microwave background, which provides a source
of very accurately known redshift. Weak lensing by the large-scale structure
enhances the power spectrum of the CMB at small angular scales, and the
Planck satellite will be able to measure this effect. In particular, polarization
information will be very useful, since lensing can introduce B-modes in the
CMB polarization. The James Webb Space Telescope, with its large aperture
of 6.5 meters and its low temperature and background, will increase the
number density of observable faint sources in the near-IR up to $5\,\mu$m to
several hundred per square arcminute, many of them at redshifts beyond 3, and
will therefore permit much more detailed weak lensing studies, in particular
of clusters (see Fig. 57; an observation of this huge number of arcs and
multiple images will answer questions about the mass distribution of clusters
that we have not yet even dared to ask). The envisioned next-generation
radio telescope Square Kilometer Array will populate the radio sky with a
source density comparable to that of the deepest current optical images. Since the
beam (that is, the point-spread function) of this radio interferometer will be 
known very accurately, PSF corrections for this instrument will be more
reliable than for optical telescopes. Furthermore, higher-order correlation of the
shear field with sources in the field will tell us about non-Gaussian properties
of galaxy-matter correlations and biasing, and therefore provide important
input into models of galaxy formation and evolution.

Fig. 57. Simulated image of lensed features in the very central part of the massive
cluster A2218, as observed with the future JWST. For these simulations, the mass
profile of the cluster as constrained from HST observations and detailed modelling
(Kneib et al. 1996) has been used. The number density of (unlensed) sources was
assumed to be $4\times 10^6\,{\rm deg}^{-2}$ down to K=29. The redshift distribution assumed is
broad and extends to redshift $z \sim 10$ with a median value $z_{\rm med} \sim 3$. The brighter
objects (cluster galaxies and brightest arcs) seen by HST are displayed as contours,
to make the faint galaxy images visible on this limited dynamic range reproduction.
An enormous number of large arcs and arclets are seen; in particular, numerous
radial arcs can be easily detected, which will allow us to determine the 'core size'
of the cluster mass distribution. Due to the broad redshift distribution of the faint
galaxies, arcs occur at quite a range of angular separations from the cluster center;
this effect will become even stronger for higher-redshift clusters. It should be noted
that this 1 arcminute field does not cover the second mass clump seen with HST;
a JWST image will cover a much larger area, and more strong lensing features
will be found which can then be combined with the weak lensing analysis of such a
cluster. For this simulation, a pixel size of $0.06''$ was used; the JWST sampling will
be better by a factor of 2 (from Schneider & Kneib 1998)